VishnuPJ / MalayaLLM

A Continually LoRA PreTrained and FineTuned 7B Llama-2 Indic model for Malayalam Language.

Hardware Requirement #2

Closed. ForestsKing closed this issue 4 months ago.

ForestsKing commented 4 months ago

Hello, I have also recently been trying to adapt Llama to a new language. I'm glad to have discovered your work; I've been looking for an easy-to-read tutorial for a long time. I would like to ask which GPU you used for this project. I also saw your issue in the Unsloth repo; due to limited resources, I would like to take advantage of Unsloth as well. I would be especially grateful if you could offer any assistance. Thanks!

VishnuPJ commented 4 months ago

Hi @ForestsKing, for Llama-2 I used an A100 40 GB GPU. I have also tried 4 x T4 24-GB GPUs, but it takes a really long time even with 4 GPUs. Unsloth is a great tool, but the free version doesn't support multiple GPUs. This version of Llama-2 was trained using the Transformers and TRL libraries. I am currently training Llama-3 using Unsloth and will release the new version soon.
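
Roughly, the training loop looks like this. This is a simplified sketch, not my exact script: the model name, corpus path, and all hyperparameters below are placeholders, and the `SFTTrainer` keyword arguments differ slightly across TRL versions.

```python
# Sketch: LoRA continual pre-training with Transformers + TRL.
# All names and hyperparameters are placeholders, not the exact
# MalayaLLM settings; SFTTrainer kwargs also vary across TRL versions.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto"
)

# Raw Malayalam text, one document per line (placeholder path).
dataset = load_dataset("text", data_files="malayalam_corpus.txt", split="train")

peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # plain next-token prediction on raw text
    max_seq_length=2048,
    peft_config=peft_config,
    args=TrainingArguments(
        output_dir="malayalam-cpt",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```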

ForestsKing commented 4 months ago

Will the Llama-3 project also involve vocabulary expansion, pre-training, and fine-tuning? If so, it's exactly what I need! What is the approximate release date for the new version? If possible, could you share the code first? I can't wait to study it. Thanks!

ForestsKing commented 4 months ago

Does Unsloth not support pre-training?

VishnuPJ commented 4 months ago

> Will the Llama-3 project also involve vocabulary expansion, pre-training, and fine-tuning? If so, it's exactly what I need! What is the approximate release date for the new version? If possible, could you share the code first? I can't wait to study it. Thanks!

Yes. I am first creating my own tokenizer specific to my language, which involves training the tokenizer and adding its vocabulary to the base model. Then comes "continued pre-training", followed by fine-tuning. Currently I am facing some compute challenges, as it is hard to get an A100 these days. I am willing to share the code, but can we discuss further in DM?
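
To give an idea of the tokenizer step, here is a rough sketch. Paths, vocab size, and the base checkpoint are placeholders, and using `add_tokens()` is a simplification; some projects merge the new pieces into the SentencePiece proto directly instead.

```python
# Sketch: train a Malayalam SentencePiece model and extend the Llama tokenizer.
# Paths, vocab size, and checkpoint name are placeholders.
import sentencepiece as spm
from transformers import AutoModelForCausalLM, LlamaTokenizer

# 1. Train a BPE SentencePiece model on raw Malayalam text.
spm.SentencePieceTrainer.train(
    input="malayalam_corpus.txt",
    model_prefix="malayalam_sp",
    vocab_size=16000,
    model_type="bpe",
    character_coverage=1.0,  # keep the full Malayalam script
)

# 2. Add the new pieces to the base tokenizer, skipping duplicates.
sp = spm.SentencePieceProcessor(model_file="malayalam_sp.model")
new_pieces = [sp.id_to_piece(i) for i in range(sp.get_piece_size())]

tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
vocab = tokenizer.get_vocab()
added = tokenizer.add_tokens([p for p in new_pieces if p not in vocab])
print(f"Added {added} new tokens")

# 3. Resize the embedding and LM-head matrices to match the new vocabulary.
#    The new rows start randomly initialized and are learned during
#    continued pre-training.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.resize_token_embeddings(len(tokenizer))
```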

VishnuPJ commented 4 months ago

> Does Unsloth not support pre-training?

It does. Pre-training is essentially just next-token prediction, and Unsloth provides notebooks for doing exactly that.
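
For example, something along these lines, following the pattern of Unsloth's continued-pretraining notebooks. The checkpoint name, data path, and hyperparameters here are placeholders, not my final settings.

```python
# Sketch: continued pre-training (next-token prediction) with Unsloth.
# Checkpoint name, data path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # 4-bit base checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters. Including embed_tokens and lm_head lets the
# embeddings of newly added vocabulary train as well.
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        "embed_tokens", "lm_head",
    ],
    lora_alpha=32,
    use_gradient_checkpointing="unsloth",
)

dataset = load_dataset("text", data_files="malayalam_corpus.txt", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # raw text, so plain next-token prediction
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="llama3-malayalam-cpt",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=5e-5,
        max_steps=1000,
        fp16=True,
        logging_steps=10,
    ),
)
trainer.train()
```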