fe1ixxu / ALMA

State-of-the-art LLM-based translation models.
MIT License
388 stars · 28 forks

Unable to Reproduce ALMA-7b-LoRA Performance, Seeking Assistance #58

Open liangyingshao opened 2 weeks ago

liangyingshao commented 2 weeks ago

Thank you for your excellent work.

While fine-tuning the ALMA-7b-Pretrain model and testing with the checkpoint you provided, I was unable to reproduce the performance of ALMA-7b-LoRA as described in the paper. I would appreciate any guidance or suggestions you could offer. I used the code, data, and scripts provided in this repository (including runs/parallel_ft_lora.sh and evals/alma_7b_lora.sh), with a training batch size of 256 and four V100 GPUs.
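
(If the 256 refers to the effective batch size across devices, it decomposes as per-device batch × gradient accumulation × number of GPUs; the concrete split below is only an illustrative sketch, not the exact values from runs/parallel_ft_lora.sh.)

```python
# Illustrative only: one way an effective batch size of 256 can be reached
# on 4 GPUs. The per-device batch size and gradient accumulation steps are
# assumed values, not necessarily those used in runs/parallel_ft_lora.sh.
num_gpus = 4
per_device_train_batch_size = 16   # assumption
gradient_accumulation_steps = 4    # assumption

effective_batch_size = num_gpus * per_device_train_batch_size * gradient_accumulation_steps
assert effective_batch_size == 256
print(f"effective batch size: {effective_batch_size}")
```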

Please feel free to ask if you need more details about my experiments.

fe1ixxu commented 2 weeks ago

Thanks for your interest! Could you please provide the transformers, accelerate, and deepspeed versions you are using? And could you also share the results you got?

liangyingshao commented 2 weeks ago

I use transformers==4.33.0, accelerate==0.33.0, and deepspeed==0.14.4. Here are my results: [result screenshots attached]
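
(In case it helps when comparing environments, a quick way to double-check the versions that are actually picked up at runtime is something like the standard-library snippet below; it is just a convenience check, not part of the repo.)

```python
# Print the versions of the packages relevant to this issue,
# as seen by the current Python environment.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["transformers", "accelerate", "deepspeed", "peft"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```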

fe1ixxu commented 2 weeks ago

It looks like the results you got are very close to what the released checkpoint produces under the same virtual env. I suspect the main issue comes from a version mismatch.

Please try uninstalling transformers, deepspeed, and accelerate, then reinstalling them as follows:

  1. pip install git+https://github.com/fe1ixxu/ALMA.git@alma-r-hf
  2. pip3 install deepspeed==0.13.1
  3. pip install accelerate==0.27.2
  4. pip install peft==0.5.0
  5. Re-evaluate the checkpoint

Hope this is helpful
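
For step 5, a minimal sketch of loading the base model together with the released LoRA adapter via peft is shown below; the hub IDs, dtype, and prompt format are illustrative assumptions rather than a copy of the evaluation script, so defer to evals/alma_7b_lora.sh for the exact settings.

```python
# Minimal sketch: load the base model, attach the released LoRA adapter,
# and translate one sentence. Hub IDs, dtype, and the prompt template are
# illustrative assumptions -- check evals/alma_7b_lora.sh for the exact setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "haoranxu/ALMA-7B-Pretrain"        # assumed hub ID
lora_id = "haoranxu/ALMA-7B-Pretrain-LoRA"   # assumed hub ID

tokenizer = AutoTokenizer.from_pretrained(base_id, padding_side="left")
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(model, lora_id)
model.eval()

prompt = "Translate this from English to German:\nEnglish: The weather is nice today.\nGerman:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100, num_beams=5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```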

liangyingshao commented 2 weeks ago

Thank you for your suggestion! I will try it out and provide feedback in this issue.

liangyingshao commented 2 weeks ago

I tried your suggestion, and it does lead to some performance improvement. [updated result screenshots attached] However, the reproduced performance still falls short of what the paper reports. Could you advise on any other factors that might affect performance? Any further suggestions for improvement would be greatly appreciated.

liangyingshao commented 2 weeks ago

By the way, could you please provide the versions of the datasets, tokenizers, and huggingface-hub packages that you are using?
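
(The same kind of standard-library check as above also covers these packages, in case that makes comparing easier; again, this is just a convenience snippet, not part of the repo.)

```python
# Report the versions of the remaining packages asked about here.
from importlib.metadata import version, PackageNotFoundError

for pkg in ["datasets", "tokenizers", "huggingface-hub"]:
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```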