Closed aixiaodewugege closed 1 year ago
The training code has already been released, but I haven't updated the README yet. Our pretraining data and fine-tuning data will be released after the final cleaning, by next week.
Hi, thanks for sharing the training code!
Can I fine-tune your model with 4 x 3090s? And is it easy to run inference with your model on 4 x 3090s? Any suggestions?
Because of memory limits, it is not possible to train our model on 4 x 3090s. Training a 13B model requires at least 8 x A100s and a model-sharding training method (like DeepSpeed's ZeRO-3). During inference, a single 3090 cannot hold our model, but we can use model parallelism and model sharding for inference. A single card needs at least 26 GB of memory for inference. We are also working on a 7B model, which will be released later.
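The 26 GB figure follows from the weight count alone. A minimal back-of-the-envelope sketch (counting only fp16 weights, ignoring activations and KV cache, which add more on top):

```python
def fp16_footprint_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the weights in fp16 (2 bytes/param)."""
    return n_params * bytes_per_param / 1e9

total = fp16_footprint_gb(13e9)  # 13B parameters
per_gpu = total / 4              # weights sharded evenly across 4 GPUs
print(total, per_gpu)            # 26.0 6.5
```

So a 13B fp16 model needs 26 GB for weights alone, which exceeds a single 3090's 24 GB, but sharding the weights across 4 x 3090s leaves roughly 6.5 GB per card for weights, with headroom for activations.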
Thanks for your brilliant work!
When are you planning to release the data and code?