yuwfan / FILTER

MIT License

about MLQA_zh_results #2

Closed howardchenhd closed 3 years ago

howardchenhd commented 3 years ago

Hi, thanks for the code! I am trying to replicate the MLQA scores (XLMR translate train multi-task) using this repo's code, but the score for zh is far too low while the scores for the other languages look right.

'test_en': {'exact_match': 70.20707506471096, 'f1': 83.69701042191264},
'test_es': {'exact_match': 57.35770036169808, 'f1': 75.17824917711975},
'test_de': {'exact_match': 54.903697144122205, 'f1': 70.18311703583281},
'test_ar': {'exact_match': 47.403936269915654, 'f1': 67.90403139343402},
'test_hi': {'exact_match': 53.395689304595365, 'f1': 71.87824404813004},
'test_vi': {'exact_match': 53.7761601455869, 'f1': 74.26649611206571},
'test_zh': {'exact_match': 5.606385049639868, 'f1': 18.790755684164417},

Unlike the XTREME repo, I see that squad_metrics.py in this repo already contains the change if 'zh' in output_prediction_file: final_text = tok_text.
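For context, here is a minimal sketch of one way a zh score can collapse while the other languages look fine. Chinese has no whitespace word boundaries, so a token-level F1 that splits on whitespace is very sensitive to any spaces the prediction text contains. This is only an illustration, not the code in squad_metrics.py and not a diagnosis of the numbers above; the names token_f1, whitespace_tokens, and char_tokens are hypothetical.

```python
from collections import Counter


def token_f1(pred_tokens, gold_tokens):
    """Token-overlap F1 in the style of SQuAD evaluation (illustrative only)."""
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def whitespace_tokens(text):
    # Splits on whitespace, which works for space-delimited languages.
    return text.split()


def char_tokens(text):
    # Treats every non-space character as a token, a common choice for Chinese.
    return [ch for ch in text if not ch.isspace()]


gold = "北京大学"
# Assumed example: a prediction whose text carries a space between pieces.
pred = "北京 大学"

f1_ws = token_f1(whitespace_tokens(pred), whitespace_tokens(gold))
f1_char = token_f1(char_tokens(pred), char_tokens(gold))
print(f1_ws, f1_char)
```

With whitespace splitting the two strings share no token, while character-level splitting treats them as identical, so how the final prediction text is built and tokenized matters much more for zh than for the other test languages.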

yuwfan commented 3 years ago

Closing this issue: according to your email, you were using the wrong squad_metrics.py.

orionw commented 1 year ago

Hi, would you mind sharing the hyperparameters used for the XLMR translate train multi-task, if you still have them? Sorry, I know this is an old repository, but if you still have that email I'd be interested in knowing. Thanks!