mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

[meta] General translation quality improvements #216

Open eu9ene opened 1 year ago

eu9ene commented 1 year ago

This is a meta issue for brainstorming ideas and tracking issues to improve the translation quality in general:

### General improvements
- [ ] https://github.com/mozilla/firefox-translations-training/issues/843
- [ ] https://github.com/mozilla/firefox-translations-training/issues/178
- [ ] https://github.com/mozilla/firefox-translations-training/issues/231
- [ ] https://github.com/mozilla/firefox-translations-training/issues/89
- [ ] https://github.com/mozilla/firefox-translations-training/issues/172
- [ ] https://github.com/mozilla/firefox-translations-training/issues/174
- [ ] https://github.com/mozilla/firefox-translations-training/issues/92
- [ ] https://github.com/mozilla/firefox-translations-training/issues/592
- [ ] https://github.com/mozilla/firefox-translations-training/issues/730
- [ ] https://github.com/mozilla/firefox-translations-training/issues/248
- [ ] https://github.com/mozilla/firefox-translations-training/issues/789
- [ ] https://github.com/mozilla/firefox-translations-training/issues/844
- [ ] https://github.com/mozilla/firefox-translations-training/issues/472
- [ ] https://github.com/mozilla/firefox-translations-training/issues/181
- [ ] https://github.com/mozilla/firefox-translations-training/issues/186
- [ ] https://github.com/mozilla/firefox-translations-training/issues/390
- [ ] https://github.com/mozilla/firefox-translations-training/issues/507
### Better cleaning
- [ ] https://github.com/mozilla/firefox-translations-training/issues/210
- [ ] https://github.com/mozilla/firefox-translations-training/issues/370
- [ ] https://github.com/mozilla/firefox-translations-training/issues/528
- [ ] https://github.com/mozilla/firefox-translations-training/issues/476
- [ ] https://github.com/mozilla/firefox-translations-training/issues/581
- [ ] https://github.com/mozilla/firefox-translations-training/issues/559
- [ ] https://github.com/mozilla/firefox-translations-training/issues/274
- [ ] https://github.com/mozilla/firefox-translations-training/issues/257
- [ ] https://github.com/mozilla/firefox-translations-training/issues/247
- [ ] https://github.com/mozilla/firefox-translations-training/issues/53
- [ ] https://github.com/mozilla/firefox-translations-training/issues/26
- [ ] https://github.com/mozilla/firefox-translations-training/issues/758
- [ ] https://github.com/mozilla/firefox-translations-training/issues/888
- [ ] https://github.com/mozilla/firefox-translations-training/issues/910
### More data
- [ ] https://github.com/mozilla/firefox-translations-training/issues/235
- [ ] https://github.com/mozilla/firefox-translations-training/issues/399
- [ ] https://github.com/mozilla/firefox-translations-training/issues/324
- [ ] https://github.com/mozilla/firefox-translations-training/issues/323
- [ ] https://github.com/mozilla/firefox-translations-training/issues/286
- [ ] https://github.com/mozilla/firefox-translations-training/issues/252
- [ ] https://github.com/mozilla/firefox-translations-training/issues/76
- [ ] https://github.com/mozilla/firefox-translations-training/issues/74
- [ ] https://github.com/mozilla/firefox-translations-training/issues/537
- [ ] https://github.com/mozilla/firefox-translations-training/issues/766
### Evals
- [ ] https://github.com/mozilla/firefox-translations-training/issues/503
- [ ] https://github.com/mozilla/firefox-translations-training/issues/228
- [ ] https://github.com/mozilla/firefox-translations-training/issues/229

marco-c commented 1 year ago

> Identify other classes of quality problems by comparing translations with a "good" known one and sort by BLEU

https://github.com/neulab/compare-mt could be another alternative to investigate classes of quality problems.
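
As a rough illustration of the "compare with a known-good translation and sort by BLEU" idea quoted above, here is a minimal sketch. It is not the pipeline's evaluation code and not compare-mt: `sentence_bleu` is a simplified, add-one-smoothed BLEU over whitespace tokens, and the names `sentence_bleu` and `worst_first` are hypothetical. A real run would more likely use an established scorer.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, known_good, max_n=4):
    """Simplified sentence-level BLEU in [0, 1].

    Assumptions: whitespace tokenization, add-one smoothing on every
    n-gram precision, a single reference. Not comparable to corpus BLEU.
    """
    cand, ref = candidate.split(), known_good.split()
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((cand_counts & ref_counts).values())
        total = sum(cand_counts.values())
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    brevity = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity * math.exp(log_precision)


def worst_first(sources, candidates, known_good):
    """Return (score, source, candidate, known_good) rows, lowest score first."""
    rows = [
        (sentence_bleu(c, g), s, c, g)
        for s, c, g in zip(sources, candidates, known_good)
    ]
    return sorted(rows, key=lambda row: row[0])
```

Reading the lowest-scoring rows first is one way to spot recurring classes of errors by hand before filing them as separate issues.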

marco-c commented 1 year ago

Filed #228.

marco-c commented 1 year ago

See also #238.

marco-c commented 6 months ago

> - Using some sort of "fuzzing"/"genetic algorithm" to choose the rules (where the oracle is an LLM)

Interesting approach partially related to this: https://huggingface.co/papers/2309.08532.
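
To make the quoted idea concrete, here is a minimal sketch of a genetic search over which rules to enable. Everything in it is an assumption for illustration: rules are modeled as an on/off mask, `evolve_rule_mask` is a hypothetical name, and `oracle` is any callable that scores a mask. In the quoted idea the oracle would be an LLM judging the cleaned data; `toy_oracle` below is only a stand-in so the sketch runs on its own. This is not the method from the linked paper.

```python
import random


def evolve_rule_mask(n_rules, oracle, generations=30, population_size=20,
                     mutation_rate=0.1, seed=0):
    """Search for the on/off rule mask that the oracle scores highest.

    Assumes n_rules >= 2 and population_size >= 4.
    """
    rng = random.Random(seed)
    population = [
        tuple(rng.random() < 0.5 for _ in range(n_rules))
        for _ in range(population_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=oracle, reverse=True)
        parents = ranked[:population_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_rules)      # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit != (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children
    return max(population, key=oracle)


# Hypothetical stand-in for the LLM oracle: rewards agreement with a
# made-up "ideal" mask. A real oracle would score cleaned output instead.
IDEAL = (True, False, True, True, False, False)


def toy_oracle(mask):
    return sum(m == t for m, t in zip(mask, IDEAL))


best_mask = evolve_rule_mask(len(IDEAL), toy_oracle)
```

Since each oracle call would be an LLM request in practice, caching scores per mask and keeping the population small would likely matter more than the exact crossover scheme.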