microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
https://arxiv.org/abs/2106.09685
MIT License

Some errors in your generation code, but if you use the code from huggingface, it works well #60

Open zibojia opened 1 year ago

zibojia commented 1 year ago

Dear Edward,

Thanks for your contribution to the community.

But I couldn't reproduce your experiments using the scripts you wrote in LoRA/examples/NLG.

I feel down and don't know what to do.

......

:(

edwardjhu commented 1 year ago

I'm sorry to hear that! Were you able to run our checkpoints and reproduce our reported numbers?

It's been a while since we released the repo. Many factors could have contributed to your not being able to get the same result. Unfortunately, I don't have the bandwidth to help you hunt down what the issues are in your experiment, but maybe getting our checkpoints to run first can help you debug things like the eval pipeline.

zibojia commented 1 year ago

Thanks for your reply! I have tested your pretrained model, and I found that some errors may occur in the beam search and decoding process.

zibojia commented 1 year ago

Sadly, after working with your code for a couple of days, I STILL COULDN'T reproduce your results, even though I used your pretrained model.

edwardjhu commented 1 year ago

I see! Since many others were able to reproduce the numbers using our pretrained checkpoints, it is fairly certain that there are bugs in your code. I don't have any specific advice for you, but it might be a good idea to reach out to others who have successfully reproduced our checkpoints (see other GitHub issues).

zibojia commented 1 year ago

After struggling for a couple of weeks, I successfully reproduced the results of your paper! Thanks for your contribution to the community.

Jai12396 commented 1 year ago

@zibojia can you please tell us how you used the pretrained models? Where can we access the pretrained models? Do we have to run the training ourselves to get the trained model?

RayCyder commented 5 months ago

"After struggling for a couple of weeks, I successfully reproduced the results of your paper! Thanks for your contribution to the community."

Hi, can you share the details of your work? I cannot reproduce the experiments in the paper. Related issue: https://github.com/microsoft/LoRA/issues/138