mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Check decoding for CJK #748

Open eu9ene opened 3 months ago

eu9ene commented 3 months ago

Do decoding, extract-best, and the other translation procedures work the same way for CJK?

ZJaume commented 1 week ago

The only thing that comes to mind is that extract-best uses BLEU, and BLEU works worse for Chinese than for other languages because it relies on tokenization.
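To illustrate the tokenization dependence, here is a minimal sketch (not the pipeline's code) of clipped n-gram precision, the building block of BLEU, computed once over whitespace tokens and once over characters. The sentence pair and the helper names `ngrams` and `clipped_precision` are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of n-grams over a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hyp_tokens, ref_tokens, n=1):
    """Clipped n-gram precision, the building block of BLEU."""
    hyp, ref = ngrams(hyp_tokens, n), ngrams(ref_tokens, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


# Hypothetical pair: the hypothesis differs from the reference in two characters.
ref = "我喜欢吃苹果。"
hyp = "我喜欢吃香蕉。"

# Chinese has no spaces, so whitespace splitting yields one token per sentence
# and any difference drops the word-level match to zero.
word_level = clipped_precision(hyp.split(), ref.split())

# Splitting into characters recovers the partial overlap (5 of 7 characters).
char_level = clipped_precision(list(hyp), list(ref))

print(word_level, char_level)
```

With whitespace tokens the near-match scores the same as a completely wrong translation, which is the failure mode to watch for if BLEU is run without a Chinese-aware tokenizer.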

I personally switched to chrF for all languages in bestbleu.py, since it correlates better with human judgments and has finer resolution.
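A rough sketch of what selecting the best n-best candidate with a character-level metric could look like. This is a simplified chrF-style score written for illustration, not the reference chrF implementation and not the script from this repository; `simple_chrf`, `extract_best`, and the sample sentences are hypothetical, and it assumes a reference translation is available for each n-best list.

```python
from collections import Counter


def char_ngrams(text, n):
    """Multiset of character n-grams, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hyp, ref, max_n=6, beta=2.0):
    """Simplified chrF-style score in [0, 1]; not the reference implementation."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_grams, ref_grams = char_ngrams(hyp, n), char_ngrams(ref, n)
        if not hyp_grams or not ref_grams:
            continue  # skip orders longer than either string
        overlap = sum((hyp_grams & ref_grams).values())
        precisions.append(overlap / sum(hyp_grams.values()))
        recalls.append(overlap / sum(ref_grams.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta=2 weights recall more heavily than precision.
    return (1 + beta**2) * p * r / (beta**2 * p + r)


def extract_best(candidates, ref):
    """Return the n-best candidate that scores highest against the reference."""
    return max(candidates, key=lambda hyp: simple_chrf(hyp, ref))


# Hypothetical n-best list for one source sentence.
reference = "我喜欢吃苹果。"
nbest = ["我喜欢吃香蕉。", "我喜欢吃苹果。", "他不喜欢苹果。"]
best = extract_best(nbest, reference)
print(best)
```

Because the score is computed over characters, it needs no word segmentation, so the same selection code can be used for Chinese and for space-delimited languages.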