EVAL-MGSM-Langbridge - GitHub Issues

zwRuan commented 2 months ago

Thank you for this wonderful work. Is there a problem with the evaluation method? I ran scripts/eval/mgsm/metamath-lb-9B.bash (checkpoints: kaist-ai/metamath-langbridge-9b and kaist-ai/langbridge_encoder_tokenizer), but the resulting scores are very low:

Task	Metric	Value	±	Stderr
mgsm_bn	acc	0.048	±	0.0135
mgsm_de	acc	0.104	±	0.0193
mgsm_en	acc	0.128	±	0.0212
mgsm_es	acc	0.096	±	0.0187
mgsm_fr	acc	0.100	±	0.0190
mgsm_ja	acc	0.052	±	0.0141
mgsm_ru	acc	0.052	±	0.0141
mgsm_sw	acc	0.028	±	0.0105
mgsm_te	acc	0.036	±	0.0118
mgsm_th	acc	0.072	±	0.0164
mgsm_zh	acc	0.048	±	0.0135
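
For readers interpreting the table above, here is a minimal sketch of how the reported standard errors relate to the accuracies. It assumes each MGSM language split has 250 problems and that the harness reports the standard error of the mean of 0/1 scores using the sample (n - 1) variance; neither assumption is stated in this thread, and `accuracy_stderr` and `N_PROBLEMS` are hypothetical names introduced for illustration.

```python
import math

# Assumption: 250 test problems per language (not stated in the thread).
N_PROBLEMS = 250


def accuracy_stderr(accuracy: float, n: int) -> float:
    """Standard error of the mean of n binary (0/1) scores.

    Uses the sample variance (n - 1 in the denominator), which for
    binary scores reduces to sqrt(p * (1 - p) / (n - 1)).
    """
    return math.sqrt(accuracy * (1.0 - accuracy) / (n - 1))


# (accuracy, reported stderr) copied from a few rows of the table above.
reported = {
    "mgsm_bn": (0.048, 0.0135),
    "mgsm_en": (0.128, 0.0212),
    "mgsm_sw": (0.028, 0.0105),
}

for task, (acc, stderr) in reported.items():
    correct = round(acc * N_PROBLEMS)  # implied number of correct answers
    estimate = accuracy_stderr(acc, N_PROBLEMS)
    print(f"{task}: {correct}/{N_PROBLEMS} correct, "
          f"stderr estimate {estimate:.4f}, reported {stderr}")
```

Under these assumptions the intervals are narrow relative to the scores, so the low values are unlikely to be sampling noise and point to a systematic problem in the setup.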

Also, if I want to train and evaluate a model with this method, should I add code to train_langbridge that saves lm.tokenizer and enc.tokenizer?

Kosei1227 commented 2 months ago

This score table is quite similar to the one I got previously. Could you try using the transformers version specified in requirement.txt? Hopefully, this will fix your issue.
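
One way to check for a version mismatch like the one suggested above is to compare the installed package against the pin in the requirements file. The following is a rough sketch only: the helper names (`pinned_version`, `installed_version`, `check_pin`) are hypothetical, it handles only simple `package==version` lines, and the version in the sample text is a placeholder, not the version the repository actually pins.

```python
import re
from importlib import metadata


def pinned_version(requirements_text: str, package: str):
    """Return the version pinned with '==' for `package`, or None."""
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*==\s*([^\s#;]+)", re.IGNORECASE
    )
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1)
    return None


def installed_version(package: str):
    """Return the installed version of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(requirements_text: str, package: str = "transformers") -> str:
    """Describe whether the installed version matches the pinned one."""
    wanted = pinned_version(requirements_text, package)
    found = installed_version(package)
    if wanted is None:
        return f"{package} is not pinned with '=='"
    if found is None:
        return f"{package} is not installed (pinned: {wanted})"
    if found != wanted:
        return f"{package} mismatch: installed {found}, pinned {wanted}"
    return f"{package} matches the pinned version {wanted}"


# Placeholder requirements text; the real file contents are an assumption.
sample = "transformers==0.0.0  # placeholder version, not the real pin\n"
print(check_pin(sample))
```

In practice the text would be read from the repository's requirements file rather than from a hard-coded string.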

MattYoon commented 2 months ago

Thank you for reporting. Please refer to https://github.com/kaistAI/LangBridge/issues/11

kaistAI / LangBridge

EVAL-MGSM-Langbridge #14