TsinghuaC3I / SoRA

The source code of the EMNLP 2023 main conference paper: Sparse Low-rank Adaptation of Pre-trained Language Models.

LoRA baseline parameters #4

Open jypppppp opened 10 months ago

jypppppp commented 10 months ago

Hi,

Thanks for your good work!

Could you clarify the learning rate, batch size, and number of epochs used for the baseline LoRA experiments on the different datasets?

Kind regards,

Jason

telxt commented 10 months ago

Thank you for your interest in our work! The hyper-parameters for LoRA are listed below:

| Dataset | Learning rate | Epochs |
| --- | --- | --- |
| CoLA | 8e-5 | 20 |
| SST-2 | 1e-4 | 10 |
| MRPC | 1e-4 | 20 |
| QQP | 3e-4 | 10 |
| STS-B | 1e-4 | 20 |
| MNLI | 3e-4 | 10 |
| QNLI | 3e-4 | 10 |
| RTE | 1.2e-3 | 50 |

The seed list is {0, 21, 42, 81, 100}, and the batch size is 8.
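For reference, here is a minimal sketch of how these hyper-parameters could be plugged into a generic HuggingFace + PEFT LoRA run (using the RTE row as an example). The backbone, rank, alpha, and target modules below are illustrative assumptions, not the exact settings from the paper or this repo's scripts:

```python
# Minimal sketch: applying the reported hyper-parameters (RTE: lr 1.2e-3,
# 50 epochs, batch size 8, seed 42) to a generic HuggingFace + PEFT LoRA setup.
# The backbone, rank, alpha, and target modules are assumptions for illustration.
from transformers import AutoModelForSequenceClassification, TrainingArguments
from peft import LoraConfig, TaskType, get_peft_model

# Backbone model is an assumption; RTE is a binary classification task.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                 # assumed rank
    lora_alpha=16,                       # assumed scaling factor
    lora_dropout=0.1,                    # assumed dropout
    target_modules=["query", "value"],   # assumed target modules
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="./lora_rte",
    learning_rate=1.2e-3,                # RTE row from the table above
    num_train_epochs=50,                 # RTE row from the table above
    per_device_train_batch_size=8,       # batch size reported above
    seed=42,                             # one of {0, 21, 42, 81, 100}
)
```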

I hope my response helps you.

ouxinwei111 commented 9 months ago

Hi, I was wondering whether you used the same learning rate for all of the rank settings. Looking forward to your help :)