microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
https://arxiv.org/abs/2106.09685
MIT License

Can't reproduce the results for GLUE and hyperparameter misalignment #149

Open nbasyl opened 12 months ago

nbasyl commented 12 months ago

Hi, thanks for the great work.

I am trying to reproduce the RoBERTa-large results on the NLU tasks, but I got a CoLA score of 0 and an MNLI score of 31.3 using the provided fine-tuning scripts. I then noticed misalignments between the hyperparameters in the provided training scripts and those in the paper: for example, in roberta_large_cola.sh the lr is set to 3e-4, but the paper uses 2e-4. Which settings should I follow to reproduce the reported results?
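(As background on the constant-0 score: CoLA is evaluated with the Matthews correlation coefficient, which is exactly 0 whenever the model collapses to predicting a single class. A minimal sketch in plain Python, with made-up labels, illustrating why a diverged run reports 0:)

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Degenerate case: any all-one-class prediction zeroes a factor in denom.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

labels = [1, 0, 1, 1, 0, 1]  # hypothetical gold labels
print(matthews_corrcoef(labels, [1] * len(labels)))  # constant predictions -> 0.0
print(matthews_corrcoef(labels, labels))             # perfect predictions -> 1.0
```

So a CoLA score of exactly 0 usually means the run diverged and predicts one class everywhere, which is consistent with the lr being too high.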

Looking forward to your reply!

Best, Sean

nbasyl commented 12 months ago

I changed the lr in the CoLA training script to 2e-4, which fixed the constant 0 eval correlation on CoLA, but I still couldn't reproduce the MNLI result :(

nbasyl commented 12 months ago

But I am still only getting a 62.82 CoLA score. Has anyone encountered a similar problem when trying to reproduce the result?

zxchasing commented 8 months ago

> But I am still only getting a 62.82 CoLA score. Has anyone encountered a similar problem when trying to reproduce the result?

Hi, did you solve this problem?

Car-pe commented 7 months ago

> I changed the lr in the CoLA training script to 2e-4, which fixed the constant 0 eval correlation on CoLA, but I still couldn't reproduce the MNLI result :(

My CoLA result is 63.48, which matches the paper, using random seeds (1 3 13 37 71). But I cannot reproduce the other tasks; only CoLA matches the paper.
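(One aggregation detail worth checking when comparing against the paper: the LoRA paper reports the median over 5 random seeds, not the mean. A small sketch, with made-up per-seed scores, of how that aggregation would look:)

```python
from statistics import median

# Hypothetical per-seed CoLA (MCC) scores; the numbers below are made up
# purely to illustrate the aggregation, not real results.
scores_by_seed = {1: 63.1, 3: 63.48, 13: 62.9, 37: 64.0, 71: 63.3}
result = median(scores_by_seed.values())  # paper-style aggregate: median over 5 seeds
print(f"median over seeds {sorted(scores_by_seed)}: {result:.2f}")
```

If you averaged instead of taking the median, a single bad seed could pull your reported number noticeably below the paper's.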