Open FairyFali opened 12 months ago
Hi,
I have replied to your email in case you haven't seen it.
You can refer to issue #38; we use commonsense_evaluate.py for evaluation. Also, please use a single GPU for training: multi-GPU training may not reproduce the results, and we are still investigating the reason.
If you have further questions, please let us know!
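The single-GPU advice above can be enforced by restricting which GPUs the process can see via `CUDA_VISIBLE_DEVICES`. This is a minimal sketch; the commented-out training command is a placeholder assumption, not this repo's documented invocation:

```shell
# Make only GPU 0 visible to the launched process, so any framework
# (PyTorch, etc.) treats the machine as a single-GPU box.
export CUDA_VISIBLE_DEVICES=0
echo "Visible GPUs: $CUDA_VISIBLE_DEVICES"
# python finetune.py ...   # placeholder: substitute your actual training command
```

Setting the variable before launch avoids any accidental multi-GPU data-parallel path inside the training script.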
Hi,
I am encountering difficulties in reproducing the experimental results on the OpenbookQA dataset. The output format is unexpected: for instance, I'm getting responses like "1 is correct. 2 is incorrect. 3 is incorrect. 4 is incorrect.", whereas the expected format is "answer1". Could you please provide a detailed command or set of instructions for both fine-tuning and evaluating the model, so I can accurately reproduce the results on OpenbookQA?
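One likely consequence of the format mismatch described above is that the evaluation's answer extraction fails: a verbose generation never contains the expected "answerN" token, so the example is scored as wrong regardless of its content. This hypothetical helper (not part of commonsense_evaluate.py) sketches what such extraction might look like, assuming the expected "answerN" format:

```python
import re
from typing import Optional

def extract_choice(output: str) -> Optional[str]:
    """Extract the first 'answerN' token from a model's raw generation.

    Hypothetical sketch: assumes the evaluation expects outputs like
    'answer1', as described in the issue above.
    """
    match = re.search(r"answer\s*([1-9])", output)
    return f"answer{match.group(1)}" if match else None

# A generation in the expected format is recovered:
print(extract_choice("the correct choice is answer1."))
# The verbose generation reported above yields no match, so it
# would be scored as incorrect:
print(extract_choice("1 is correct. 2 is incorrect. 3 is incorrect."))
```

If the extraction indeed returns nothing on such outputs, the low scores would reflect a formatting mismatch rather than model quality, which is why matching the fine-tuning prompt template matters.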