grammatical / neural-naacl2018

Neural models and instructions on how to reproduce our results for our neural grammatical error correction systems from M. Junczys-Dowmunt, R. Grundkiewicz, S. Guha, K. Heafield: Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task, NAACL 2018.
MIT License

Why is the M2 score in models/README.md different from that in your paper? #2

Open h-asano opened 5 years ago

h-asano commented 5 years ago

Your reported M2 score on CoNLL-2014 is 57.53, but in your paper the M2 score is 55.8.

snukky commented 5 years ago

The published system is not exactly the same system we trained for the paper, as we have lost the original models and config files. I reconstructed the system with a newer version of Marian, and there are several reasons why the M2 scores differ:

So these are changes that someone could make while reconstructing our systems from scratch using the same data. The training data, subword segmentation codes, and vocabularies are exactly the same.

h-asano commented 5 years ago

Thank you very much!