Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License

Multilingual Tuned Model Translating everything to "sssssssss" #79

Open hdeval1 opened 1 year ago

hdeval1 commented 1 year ago

I was able to successfully fine-tune a multilingual model using data for one of its included languages. Unfortunately, every translation run through the tuned model comes out as "ssssss" (even when a blank line is sent). I can't find any reports of this happening with Marian models after tuning and can't seem to figure out the issue. The source and target data files all look fine, and there are no errors in the tuning process. Have you ever seen this happen, or do you have any idea what it could be? I am really stuck. Thank you!
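For context, decoding looks roughly like this sketch (the model path is from my setup; the vocab file names are placeholders, and this assumes Marian was built with SentencePiece support):

# Decode one test line with the tuned model (vocab names are placeholders):
echo "Dit is 'n toets." | marian-decoder \
  -m /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz.best-perplexity.npz \
  -v source.spm target.spm \
  -b 6
# Every input line, including an empty one, comes back as a run of "s" characters.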

jorgtied commented 1 year ago

That is weird. Maybe fine-tuning ran for too long on a very small data set, so the model heavily overfitted to the fine-tuning data and forgot everything else? Did you see strange perplexity scores during fine-tuning?
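If that is what happened, validating more often and stopping early would usually catch it. Roughly, as a sketch (the flag values here are illustrative, not OPUS-MT-train defaults):

# Validate every 50 updates (Marian's default is 10000) and stop after
# 5 validations without improvement; file names and values are illustrative.
marian --model model.npz \
  --train-sets train.afr train.eng \
  --vocabs source.spm target.spm \
  --valid-sets valid.afr valid.eng \
  --valid-metrics perplexity \
  --valid-freq 50 --early-stopping 5 --after-epochs 10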

hdeval1 commented 1 year ago
[2022-09-01 15:00:07] Allocating memory for Adam-specific shards
[2022-09-01 15:00:07] [memory] Reserving 343 MB, device cpu0
[2022-09-01 15:06:27] Seen 2,467 samples
[2022-09-01 15:06:27] Starting data epoch 2 in logical epoch 2
[2022-09-01 15:12:58] Seen 2,467 samples
[2022-09-01 15:12:58] Starting data epoch 3 in logical epoch 3
[2022-09-01 15:19:30] Seen 2,467 samples
[2022-09-01 15:19:30] Starting data epoch 4 in logical epoch 4
[2022-09-01 15:26:01] Seen 2,467 samples
[2022-09-01 15:26:01] Starting data epoch 5 in logical epoch 5
[2022-09-01 15:32:32] Seen 2,467 samples
[2022-09-01 15:32:32] Starting data epoch 6 in logical epoch 6
[2022-09-01 15:32:32] Training finished
[2022-09-01 15:32:51] Saving model weights and runtime parameters to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz.best-perplexity.npz
[2022-09-01 15:32:51] [valid] Ep. 6 : Up. 150 : perplexity : 700.626 : new best
[2022-09-01 15:32:51] Saving model weights and runtime parameters to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz
[2022-09-01 15:32:52] Saving Adam parameters
[2022-09-01 15:32:54] [training] Saving training checkpoint to /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz and /OPUS-MT-train/work-tatoeba/mul-eng/opus-tuned4afr2eng.spm1k-spm1k.transformer-align.model1.npz.optimizer.npz

It looks like it only ran a single validation round (Up. 150, perplexity 700.626)? What is even weirder is that the compare file (Tatoeba-test-v2021-08-07.afr-eng.opus-tuned4afr2eng.spm1k-spm1k1.transformer-align.afr.eng) shows the translations as the "ssss" runs and blank lines:

sssssssssssssssssssssssssssssssss

sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

sssssssssssssssssssssssssssssssss

sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

sssssssssssssssssssssssssssssssss

sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssss

And then of course the eval file records the BLEU score as 0. I double-checked all the data (I used about 1,500 lines of afr-eng data to fine-tune the mul-eng model). I am really at a loss here because I can tune single-language-pair models just fine using the same steps. Do you have any more insight?
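One check I can still run: degenerate single-character output like this could also come from a vocabulary mismatch (e.g. the fine-tuning data segmented with a different SentencePiece model than the base model expects). Encoding a few training lines with the base model's own .spm should reveal that. A sketch (file names are placeholders):

# Encode a few fine-tuning source lines with the SentencePiece model used
# by the base mul-eng model (file names are placeholders).
head -5 train.afr | spm_encode --model=source.spm --output_format=piece
# Output degenerating into single characters would suggest the data was
# segmented with the wrong (or a freshly trained) vocabulary.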