fe1ixxu / ALMA

State-of-the-art LLM-based translation models.
MIT License
388 stars · 28 forks

Unable to Reproduce ALMA-7b-LoRA Performance, Seeking Assistance #58

Open liangyingshao opened 2 weeks ago

liangyingshao commented 2 weeks ago

Thank you for your excellent work.

While fine-tuning the ALMA-7b-Pretrain model and testing with the checkpoint you provided, I was unable to reproduce the performance of ALMA-7b-LoRA as described in the paper. I would appreciate any guidance or suggestions you could offer. I used the code, data, and scripts provided in this repository (including runs/parallel_ft_lora.sh and evals/alma_7b_lora.sh), with a training batch size of 256 and four V100 GPUs.
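
(If the 256 refers to the effective batch size across devices, it decomposes as per-device batch × gradient accumulation × number of GPUs; the concrete split below is only an illustrative sketch, not the exact values from runs/parallel_ft_lora.sh.)

```python
# Illustrative only: one way an effective batch size of 256 can be reached
# on 4 GPUs. The per-device batch size and gradient accumulation steps are
# assumed values, not necessarily those used in runs/parallel_ft_lora.sh.
num_gpus = 4
per_device_train_batch_size = 16   # assumption
gradient_accumulation_steps = 4    # assumption

effective_batch_size = num_gpus * per_device_train_batch_size * gradient_accumulation_steps
assert effective_batch_size == 256
print(f"effective batch size: {effective_batch_size}")
```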

Please feel free to ask if you need more details about my experiments.

fe1ixxu commented 2 weeks ago

Thanks for your interest! Could you please provide the transformers, accelerate, and deepspeed versions you are using? And could you also share the results you got?

liangyingshao commented 2 weeks ago

I use transformers==4.33.0, accelerate==0.33.0, and deepspeed==0.14.4. Here are my results: [result screenshots attached]
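
(In case it helps when comparing environments, a quick way to double-check the versions that are actually picked up at runtime is something like the standard-library snippet below; it is just a convenience check, not part of the repo.)

```python
# Print the versions of the packages relevant to this issue,
# as seen by the current Python environment.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["transformers", "accelerate", "deepspeed", "peft"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```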

fe1ixxu commented 2 weeks ago

It looks like the results you got are very close to what the released checkpoint produces under the same virtual env. I suspect the main issue comes from a version mismatch.

Please try uninstalling transformers, deepspeed, and accelerate, then reinstalling them as follows:

  1. pip install git+https://github.com/fe1ixxu/ALMA.git@alma-r-hf
  2. pip3 install deepspeed==0.13.1
  3. pip install accelerate==0.27.2
  4. pip install peft==0.5.0
  5. Re-evaluate the checkpoint

Hope this is helpful
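
For step 5, a minimal sketch of loading the base model together with the released LoRA adapter via peft is shown below; the hub IDs, dtype, and prompt format are illustrative assumptions rather than a copy of the evaluation script, so defer to evals/alma_7b_lora.sh for the exact settings.

```python
# Minimal sketch: load the base model, attach the released LoRA adapter,
# and translate one sentence. Hub IDs, dtype, and the prompt template are
# illustrative assumptions -- check evals/alma_7b_lora.sh for the exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "haoranxu/ALMA-7B-Pretrain"        # assumed hub ID
lora_id = "haoranxu/ALMA-7B-Pretrain-LoRA"   # assumed hub ID

tokenizer = AutoTokenizer.from_pretrained(base_id, padding_side="left")
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, lora_id)
model.eval()

prompt = "Translate this from English to German:\nEnglish: The weather is nice today.\nGerman:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, num_beams=5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```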

liangyingshao commented 2 weeks ago

Thank you for your suggestion! I will try it out and provide feedback in this issue.

liangyingshao commented 2 weeks ago

I tried your suggestion, and it does lead to some performance improvement. [updated result screenshots attached] However, the reproduced performance still falls short of what the paper reports. Could you advise on any other factors that might affect performance? Any further suggestions for improvement would be greatly appreciated.

liangyingshao commented 2 weeks ago

By the way, could you please provide the versions of the datasets, tokenizers, and huggingface-hub packages that you are using?
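
(The same kind of standard-library check as above also covers these packages, in case that makes comparing easier; again, this is just a convenience snippet, not part of the repo.)

```python
# Report the versions of the remaining packages asked about here.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["datasets", "tokenizers", "huggingface-hub"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```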