Closed: yayaQAQ closed this issue 3 years ago
We have not experimented with using mT5 for translation. We use a constant learning rate during fine-tuning, but otherwise your plots look fine. Is the one sentence that starts with '39' repeated many times over the course of training, or something like that? Otherwise, I don't know what the issue is, since I don't have access to your data, and this is not a use case of mT5 that we intend to support.
@craffel There is only one sentence beginning with '39' in my corpus, but the generated results all start with '39'. Also, what is the value of the constant learning rate? Maybe I can try it on my task. Thanks!
1e-3; see the paper.
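For reference, a minimal sketch of what constant-LR fine-tuning could look like with Hugging Face Transformers' Adafactor (the optimizer used in the T5 paper); the checkpoint name and the toy sentence pair below are placeholders, not from this thread:

```python
# Minimal sketch: fine-tune mT5 with a constant learning rate of 1e-3
# using Adafactor. Checkpoint name and toy data are placeholders.
from transformers import MT5ForConditionalGeneration, MT5Tokenizer
from transformers.optimization import Adafactor

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# relative_step/warmup_init must be disabled for Adafactor to use a fixed lr.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-3,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)

batch = tokenizer("Hello, world!", return_tensors="pt")
labels = tokenizer("你好，世界！", return_tensors="pt").input_ids
# In a real run, padded label positions should be set to -100 so the loss
# ignores them; this single unpadded pair skips that step.

model.train()
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```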
@craffel Thanks!
I am trying to do a translation task with mT5, specifically English to Chinese. After fine-tuning for many steps on millions of parallel sentence pairs, the loss went down to 2.407 and plateaued there. I then tried using the fine-tuned model.

[Screenshot: training loss curve]
My input is an English sentence and the output is a Chinese sentence. Before generating the Chinese sentence, the model always generates '39' first. I generated 30 sentences, and 29 of them started with '39'. Of course, no sentence in the English input begins with '39'. The generation quality is bad.
[Screenshot: original sentences (top), generated sentences (bottom)]
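One way to debug this is to look at the raw generated token IDs rather than the decoded string, to see which token(s) the leading '39' actually corresponds to. A minimal sketch with Hugging Face Transformers; the checkpoint path and input sentence are placeholders:

```python
# Debugging sketch: print the raw token IDs and SentencePiece pieces of a
# generated sequence. "path/to/finetuned-mt5" is a placeholder path.
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("path/to/finetuned-mt5")
model = MT5ForConditionalGeneration.from_pretrained("path/to/finetuned-mt5")

inputs = tokenizer("The weather is nice today.", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)[0].tolist()

print(output_ids)                                   # raw token IDs
print(tokenizer.convert_ids_to_tokens(output_ids))  # subword pieces
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```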
I searched my Chinese corpus and found only one sentence that starts with '39'. What could be causing this problem? Should I be using mT5 for a translation task at all? Is there a good approach for translation tasks? Thanks!
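To double-check the data side, a quick scan of the target corpus can count how many lines start with '39'. A small sketch; `train.zh` is a placeholder name for the Chinese side of the parallel corpus, one sentence per line:

```python
# Quick data check: count target-side sentences that start with "39".
count = total = 0
with open("train.zh", encoding="utf-8") as f:
    for line in f:
        total += 1
        if line.lstrip().startswith("39"):
            count += 1
print(f"{count} of {total} sentences start with '39'")
```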