MiuLab / Taiwan-LLM

Traditional Mandarin LLMs for Taiwan
https://twllm.com
Apache License 2.0
1.23k stars 102 forks

About the base model of Taiwan-LLM-7B-v2.1-chat #45

Closed: larry0220 closed this issue 9 months ago

larry0220 commented 10 months ago

The Hugging Face page for yentinglin/Taiwan-LLM-7B-v2.1-chat says "Taiwan LLM based on Mistral-7B-v0.1", but config.json shows "architectures": [ "LlamaForCausalLM" ]
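
For readers who want to reproduce the check, the model family a checkpoint declares can be read from the architectures field of its config.json. The following is a minimal sketch, not the maintainers' tooling: CONFIG_TEXT is a hypothetical excerpt that holds only the value quoted above, and base_family and FAMILY_BY_ARCHITECTURE are illustrative names. The mapping assumes the class names that the transformers library uses for Llama and Mistral checkpoints.

```python
import json

# Hypothetical excerpt of config.json; only the "architectures" value
# is quoted in this issue, the rest of the real file is omitted.
CONFIG_TEXT = '{"architectures": ["LlamaForCausalLM"]}'

# Assumed mapping from transformers class names to model families.
FAMILY_BY_ARCHITECTURE = {
    "LlamaForCausalLM": "llama",
    "MistralForCausalLM": "mistral",
}


def base_family(config_text: str) -> str:
    """Return the model family implied by the first "architectures" entry."""
    config = json.loads(config_text)
    architectures = config.get("architectures") or []
    if not architectures:
        return "unknown"
    return FAMILY_BY_ARCHITECTURE.get(architectures[0], "unknown")


print(base_family(CONFIG_TEXT))
```

A result of "llama" here indicates a Llama-style checkpoint, which is the mismatch with the model card that this issue asks about.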

adamlin120 commented 9 months ago

We originally trained a Mistral model, but using FSDP at the time made the loss unstable, so we later switched to Llama-2 :)

DhruvaBansal00 commented 8 months ago

@adamlin120 Did you train the Llama-2 model using FSDP or DeepSpeed?