cannot reproduce the baseline score of question answering with transformers v4.19.2

yahoojapan / JGLUE

JGLUE: Japanese General Language Understanding Evaluation

Creative Commons Attribution Share Alike 4.0 International


cannot reproduce the baseline score of question answering with transformers v4.19.2 #9

Open kumapo opened 1 year ago

kumapo commented 1 year ago

I tried to reproduce the baseline score using the run_squad.py parameters you provided and a patched transformers v4.19.2, but the scores in eval_results.json are much lower than the baseline:

    "exact": 42.30076542098154,
    "f1": 42.390814948221525,

Based on fine-tuning/README.md, I understand that you confirmed transformers v4.19.2 works. What scores did you get with it?

I'm attaching the requirements.txt and eval_results.json from my run with transformers v4.19.2.

tomohideshibata commented 1 year ago

Thank you for your report. I will check it.

Which pretrained model did you use?

kumapo commented 1 year ago

@tomohideshibata

Thank you for the quick reply. I used cl-tohoku/bert-base-japanese-v2.