mozilla/translations
The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0
154 stars
33 forks
issues
Newest
Investigate using LLMs to generate training data
#767
marco-c
opened
3 months ago
0
Integrate datasets used for LLM training as monolingual datasets
#766
marco-c
opened
3 months ago
2
English to Serbian has low quality of the teacher models
#765
eu9ene
opened
3 months ago
5
Improve automatic quality evaluation
#764
eu9ene
opened
3 months ago
3
Process alignments in chunks
#763
eu9ene
closed
3 months ago
0
Increase disk size for merge-translated
#762
eu9ene
closed
3 months ago
0
fix: invalidate caches when fetch, docker, or toolchain tasks change
#761
bhearsum
closed
3 months ago
1
merge-translated-el-en keeps restarting
#760
eu9ene
closed
3 months ago
9
Add a link to W&B dashboard
#759
eu9ene
closed
3 months ago
0
Add a cleaning rule for URL names, such as Amazon.com -> Amazon.it
#758
gregtatum
opened
3 months ago
2
Retrain old models with robustness fixes
#757
gregtatum
opened
3 months ago
1
English to Lithuanian did not meet our quality bar
#756
gregtatum
opened
3 months ago
4
Delete temporary files after successfully generating alignments
#755
bhearsum
closed
3 months ago
2
Multiply COMET metric by 100 before publication
#754
La0
closed
3 months ago
0
Check shortlist for CJK
#753
eu9ene
opened
3 months ago
0
Support data import for CJK
#752
eu9ene
opened
3 months ago
0
Check alignments for CJK
#751
eu9ene
opened
3 months ago
1
Investigate OpusTrainer compatibility for CJK
#750
eu9ene
opened
3 months ago
4
Check Bicleaner-AI models for CJK
#749
eu9ene
closed
1 week ago
1
Check decoding for CJK
#748
eu9ene
opened
3 months ago
1
Check training for CJK
#747
eu9ene
opened
3 months ago
3
Check evaluation procedure for CJK
#746
eu9ene
opened
3 months ago
2
Investigate issues with SentencePiece vocabulary for CJK
#745
eu9ene
opened
3 months ago
5
Implement corpus-specific fixes for CJK
#744
eu9ene
opened
3 months ago
3
Support dataset desegmentation for CJK
#743
eu9ene
opened
3 months ago
0
Support CJK in OpusCleaner
#742
eu9ene
opened
3 months ago
3
Implement conversion between Chinese Traditional and Simplified
#741
eu9ene
opened
3 months ago
3
Support CJK in find_corpus and config generator
#740
eu9ene
opened
3 months ago
0
Process alignments in chunks
#739
eu9ene
opened
3 months ago
2
chore: bump taskgraph to 9.2.0
#738
bhearsum
closed
3 months ago
2
Improve translation of social posts
#737
eu9ene
opened
3 months ago
0
Improve translation of URLs
#736
eu9ene
opened
3 months ago
4
Create Sardinian config
#735
gregtatum
closed
2 months ago
0
W&B runs published from GCP experiments should be suffixed with the Task Group ID when possible
#734
vrigal
closed
1 month ago
2
Evaluate other GPU types
#733
eu9ene
opened
3 months ago
0
fix: pre-download fastText model in bicleaner.sh
#732
gabrielBusta
opened
3 months ago
1
COMET results are not visible on custom charts
#731
eu9ene
closed
3 months ago
1
Consider using backward-forward translation for knowledge distillation
#730
eu9ene
opened
3 months ago
0
Duplicate runs in W&B
#729
eu9ene
closed
3 months ago
2
start_stage often reruns almost all "evaluate" tasks
#728
eu9ene
opened
3 months ago
4
Use unique run names in Weights & Biases
#727
vrigal
closed
3 months ago
0
Support multilingual models
#726
eu9ene
closed
3 months ago
1
debugging: use d2g on bhearsum's special worker type
#725
bhearsum
opened
3 months ago
0
GPUs stopped working on bicleaner-ai for id-en
#724
eu9ene
closed
3 months ago
1
One of the teachers for el-en diverged
#723
eu9ene
closed
3 months ago
1
Add a check that there are visible GPUs
#722
gregtatum
closed
3 months ago
0
Student alignments fail for en-uk
#721
eu9ene
closed
3 months ago
1
Publish Marian/OpusTrainer configuration YAMLs and dataset statistics
#720
vrigal
closed
1 month ago
15
Improve usability of running selected tasks
#719
eu9ene
opened
4 months ago
3
Do not use aggressive dash splitting in tokenization
#718
eu9ene
closed
4 months ago
0