bitextor / bicleaner

Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
GNU General Public License v3.0

Experiments on WMT 2018 shared task setup #18

Closed phikoehn closed 5 years ago

phikoehn commented 5 years ago

I trained a few models for en-de and tested them on the WMT 2018 de-en shared task setup.

BLEU-c/SMT . . . . . . . . 100m _10m __1m
PROMPT-LM submission . . . 31.1 25.4
JHU Zipporah submission. . 30.2 26.3
provided model . . . . . . 29.6 23.3 18.8
nc . . . . . . . . . . . . 27.1 26.3 22.5
wmt. . . . . . . . . . . . 28.0 26.7 22.3
wmt-cc . . . . . . . . . . 30.7 26.3 21.3
wmt bad-paracrawl. . . . . 30.0 27.5 21.2
wmt-cc bad-paracrawl . . . 30.6 27.4 21.2

I will also run NMT models but these may take a while.

I generally get good numbers but not with the provided model.
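A quick way to quantify this from the scores above is sketched below. It is only an illustration: the dictionary re-enters the reported BLEU numbers by hand, the two submission rows are left out because they list only two values, and the helper name `gaps_to_best` is made up for this example.

```python
# BLEU-c/SMT scores re-entered from the table above (columns: 100m, 10m, 1m).
# Only rows that report all three subset sizes are included.
scores = {
    "provided model":       (29.6, 23.3, 18.8),
    "nc":                   (27.1, 26.3, 22.5),
    "wmt":                  (28.0, 26.7, 22.3),
    "wmt-cc":               (30.7, 26.3, 21.3),
    "wmt bad-paracrawl":    (30.0, 27.5, 21.2),
    "wmt-cc bad-paracrawl": (30.6, 27.4, 21.2),
}
sizes = ("100m", "10m", "1m")


def gaps_to_best(scores, baseline="provided model"):
    """For each subset size, find the best non-baseline row and its gap to the baseline.

    Hypothetical helper: returns {size: (row_name, best_score, best_score - baseline_score)}.
    """
    result = {}
    for i, size in enumerate(sizes):
        name, best = max(
            ((n, s[i]) for n, s in scores.items() if n != baseline),
            key=lambda pair: pair[1],
        )
        result[size] = (name, best, round(best - scores[baseline][i], 1))
    return result


for size, (name, best, gap) in gaps_to_best(scores).items():
    print(f"{size}: best retrained = {name} ({best}), gap to provided model = {gap:+.1f}")
```

On these numbers the provided model trails the best retrained model at every subset size, with the smallest gap at 100m.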

If you have any advice on how to train this differently, please let me know.

mbanon commented 5 years ago

Probably @vitaka is the most suitable person to answer this.

vitaka commented 5 years ago

Hi Philipp,

Thanks for reporting this.

The exact WMT 2018 results are difficult to reproduce with the current version of Bicleaner, for two main reasons:

Concerning the differences with the provided model (which I guess is the latest one released), the only difference that comes to mind is that its probabilistic dictionaries were extracted directly from OPUS. The people at Prompsit can provide more detailed information here, since I did not (completely) take part in the last release.
