cmnfriend / O-LoRA

MIT License

Reproducing llama2 results #23

Open · chengshuang18 opened this issue 3 weeks ago

chengshuang18 commented 3 weeks ago

Thanks for this work; it offers a useful approach to catastrophic forgetting in continual learning (CL). I ran the llama2 script provided in the codebase, but the results came out badly broken. What might be causing this? Are there key points to watch when running the experiments, or parameter settings that need adjusting? Could it be that the O-LoRA lamda parameters are set too small, leading to excessive forgetting? Below are my per-task results when tuning on order 2:

After task 1 (dbpedia):
  epoch = 1.0
  predict_exact_match = 97.6184
  predict_exact_match_for_TC = 97.6184
  predict_exact_match_for_dbpedia = 97.6184

After task 2 (amazon):
  epoch = 1.0
  predict_exact_match = 43.2171
  predict_exact_match_for_SC = 52.9868
  predict_exact_match_for_TC = 33.4474
  predict_exact_match_for_amazon = 52.9868
  predict_exact_match_for_dbpedia = 33.4474

After task 3 (yahoo):
  epoch = 1.0
  predict_exact_match = 26.8114
  predict_exact_match_for_SC = 3.3289
  predict_exact_match_for_TC = 38.5526
  predict_exact_match_for_amazon = 3.3289
  predict_exact_match_for_dbpedia = 10.2105
  predict_exact_match_for_yahoo = 66.8947

After task 4 (agnews):
  epoch = 0.99
  predict_exact_match = 35.3191
  predict_exact_match_for_SC = 25.5132
  predict_exact_match_for_TC = 38.5877
  predict_exact_match_for_agnews = 87.4868
  predict_exact_match_for_amazon = 25.5132
  predict_exact_match_for_dbpedia = 19.1447
  predict_exact_match_for_yahoo = 9.1316

cmnfriend commented 3 weeks ago

I suspect the same set of hyperparameters will behave quite differently on 8 GPUs versus other GPU counts; we ran on 8 GPUs. If the cause is hardware-related, you will need to re-tune: try increasing lambda1 and lambda2, or lowering the learning rate.

chengshuang18 commented 3 weeks ago

Thanks for the quick reply, but I also used 8 GPUs: the results above were produced on 8x A100 40G cards. @cmnfriend

chengshuang18 commented 3 weeks ago

One more question: did you use llama2_chat or llama2?