cmnfriend / O-LoRA

MIT License

Question Regarding Reproducing LLaMA Results #32

Open JohannaOm opened 2 weeks ago

JohannaOm commented 2 weeks ago

Hello,

Thank you for your excellent work and for making your code available.

I'm trying to reproduce your LLaMA results using the scripts_llama/order_1.sh script, but I'm getting subpar results. For DBpedia I expected an exact-match accuracy of about 98%, but I'm only reaching a maximum of 91%. This happens even when DBpedia is trained as the first task.

Do you have any suggestions on what might be causing this problem?

Thank you in advance.

cmnfriend commented 3 hours ago

Sorry for the late response. We found that results can differ when the number of GPUs differs, even though we kept the global batch size the same (we used 8 A100s in our experiments). Also, reducing the learning rate (to 1e-04, for example) may help reach the expected performance.
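For anyone reproducing this on fewer than 8 GPUs, a minimal sketch of the batch-size arithmetic may help. The numbers below are illustrative, not the actual values from scripts_llama/order_1.sh: the idea is to raise gradient accumulation so the product stays equal to the authors' 8-GPU setup.

```python
def global_batch_size(per_device_bs: int, grad_accum_steps: int, num_gpus: int) -> int:
    """Effective batch size seen by the optimizer per update step.

    Hypothetical helper for illustration; the real values come from the
    training script's per_device_train_batch_size and
    gradient_accumulation_steps arguments.
    """
    return per_device_bs * grad_accum_steps * num_gpus


# Authors' setup: 8 GPUs (example per-device batch size of 1).
reference = global_batch_size(per_device_bs=1, grad_accum_steps=1, num_gpus=8)

# Single-GPU reproduction: multiply gradient accumulation by 8 to match.
single_gpu = global_batch_size(per_device_bs=1, grad_accum_steps=8, num_gpus=1)

print(reference, single_gpu)  # → 8 8
```

Note that even with a matched global batch size, the per-update gradient noise and any batch-norm-free but data-order-dependent effects can still differ across GPU counts, which is consistent with the variation reported above.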