Open svupper opened 1 year ago
Hi @svupper,
Meta's LLaMA models were trained on a massive amount of data - 1.0T to 1.4T tokens depending on model size, using 2048 A100 80GB GPUs over a period of roughly 5 months. Continuing the pre-training of a LLaMA model on a French corpus is definitely a promising approach to improve its performance on French. However, this option is still quite expensive and may require significant computational resources. I'm currently pre-training it on a small French dataset to see how much it improves. Stay tuned!
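In case it helps others trying the same thing, here is a minimal sketch of what continued pre-training could look like with Hugging Face Transformers. To be clear, this is not my actual setup: the checkpoint name, the French corpus (an OSCAR subset), and every hyperparameter below are placeholders for illustration.

```python
# Minimal sketch of continued pre-training of a causal LM on a French corpus.
# All names and hyperparameters below are illustrative assumptions, not the
# actual configuration used in this thread.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "huggyllama/llama-7b"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# LLaMA tokenizers ship without a pad token; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Small French corpus slice for a quick experiment (placeholder dataset).
dataset = load_dataset("oscar", "unshuffled_deduplicated_fr", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="llama-fr-continued",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=32,
        learning_rate=2e-5,
        num_train_epochs=1,
        bf16=True,
        logging_steps=50,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In practice you would likely need DeepSpeed/FSDP or parameter-efficient methods (e.g. LoRA) to make this fit on available hardware, but the overall loop is the same.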
Creating a French LLaMA version by translating the RedPajama dataset