dandelionsllm / pandallm

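The Panda project is an open-source overseas Chinese large language model project launched in May 2023. It is dedicated to exploring the full technology stack of the large-model era and aims to promote innovation and collaboration in Chinese natural language processing.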
Apache License 2.0
1.06k stars 91 forks

65b-model #8

Closed: tfka closed this issue 1 year ago

tfka commented 1 year ago

Is there a plan to release the 65b-pandallm?

SparkJiao commented 1 year ago

A 65B model is difficult to train, so the exact release date is not clear.

We plan to release the 13B models first, which should come before next Mar 14, and then the 30B models. In my experience, training a 30B model requires around 2 weeks, so it is expected to be released before June.

A 65B model currently cannot fit on a single A100, so we need some tricks such as model parallelism, but there is no efficient implementation of that for LLaMA. So we cannot determine an exact date.
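To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It rests on assumptions that are not stated in this thread: nominal parameter counts (13B, 30B, 65B), an 80 GB A100, fp16 weights at 2 bytes per parameter, and the commonly cited estimate of about 16 bytes per parameter for mixed-precision Adam training (fp16 weights and gradients plus fp32 master weights, momentum, and variance). Activations and framework overhead are ignored, so real requirements are higher. The helper names are hypothetical.

```python
import math

GIB = 1024 ** 3
A100_GIB = 80  # assumption: the 80 GB A100 variant

def weights_gib(n_params, bytes_per_param=2):
    """Memory for the weights alone (fp16 by default), in GiB."""
    return n_params * bytes_per_param / GIB

def adam_training_gib(n_params, bytes_per_param=16):
    """Weights + gradients + Adam states under mixed precision, in GiB.

    16 bytes/param is a common rule of thumb, not a measured value.
    """
    return n_params * bytes_per_param / GIB

def min_gpus(required_gib, gpu_gib=A100_GIB):
    """Lower bound on GPU count if memory could be split perfectly."""
    return math.ceil(required_gib / gpu_gib)

if __name__ == "__main__":
    # Nominal sizes; the actual LLaMA parameter counts differ slightly.
    for name, n in [("13B", 13e9), ("30B", 30e9), ("65B", 65e9)]:
        w = weights_gib(n)
        t = adam_training_gib(n)
        print(f"{name}: weights ~{w:.0f} GiB, training state ~{t:.0f} GiB, "
              f"fits on one A100 for inference: {w <= A100_GIB}, "
              f"min GPUs for training state: {min_gpus(t)}")
```

Under these assumptions the fp16 weights of a 65B model alone exceed a single 80 GB card, while the 13B and 30B weights do not, which is why the larger model needs its state split across devices.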

Thanks.