Closed (tfka closed 1 year ago)
The 65B model is difficult to train, so the exact date is not yet clear.
We plan to release the 13B models first, which should come before next Mar 14, followed by the 30B models. In my experience, training the 30B models takes around 2 weeks, so they are expected to be released before June.
The 65B model currently cannot fit on a single A100, so we need some tricks, such as model parallelism, but there is no efficient implementation of that for LLaMA. So we cannot determine an exact date.
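As a rough illustration of why the 65B model does not fit on one card, here is a back-of-the-envelope sketch. It assumes nominal parameter counts (13e9, 30e9, 65e9), fp16 weights at 2 bytes per parameter, and the 80 GB A100 variant; the helper name `weights_gb` is made up for this example. It counts the weights only, so the real training footprint (gradients, optimizer states, activations) is larger.

```python
# Rough estimate of weight memory only; training needs considerably more
# (gradients, optimizer states, activations are ignored here).
A100_MEMORY_GB = 80  # assumption: the 80 GB A100 variant (a 40 GB one also exists)


def weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes). Default is fp16."""
    return num_params * bytes_per_param / 1e9


# Nominal sizes taken from the model names, not exact parameter counts.
for name, params in [("13B", 13e9), ("30B", 30e9), ("65B", 65e9)]:
    gb = weights_gb(params)
    fits = gb <= A100_MEMORY_GB
    print(f"{name}: ~{gb:.0f} GB of fp16 weights, fits on one A100: {fits}")
```

Under these assumptions the 65B weights alone come to about 130 GB, which is why the model has to be split across several GPUs.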
Thanks.
Is there a plan to release the 65b-pandallm?