Open adabghi opened 5 years ago
You cannot expect a model trained on single sentences to translate multi-sentence inputs (or even a single sentence that is longer than the training-time maximum). You need to split the document into sentences before translation.
Alternatively, you can try document-level training, which is a hot research topic (see the papers and submissions for the upcoming WMT2019 shared task, which focuses on doc-level translation).
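A minimal sketch of the sentence-splitting approach suggested above, in plain Python. The regex splitter is a naive stand-in (it will break on abbreviations such as "e.g."), and `translate_sentence` is a hypothetical callable that wraps whatever single-sentence decoding you already have working; neither is part of Tensor2Tensor.

```python
import re
from typing import Callable, List

# Naive boundary: whitespace that follows ".", "!" or "?".
# Assumption: good enough for illustration; a real sentence
# segmenter handles abbreviations, quotes, etc. much better.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Split a paragraph into sentences, dropping empty pieces."""
    pieces = _SENTENCE_END.split(text.strip())
    return [p.strip() for p in pieces if p.strip()]


def translate_document(text: str,
                       translate_sentence: Callable[[str], str]) -> str:
    """Translate each sentence separately and join the results.

    `translate_sentence` is a hypothetical wrapper around your own
    single-sentence translation call.
    """
    return " ".join(translate_sentence(s) for s in split_sentences(text))
```

Each sentence then reaches the model in the same form it saw during training, which should avoid both the skipped sentences and the degraded output on long inputs.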
@martinpopel Thanks for the response! That's what I thought too, but what made me open this issue is that I am able to translate a whole document (text) when using the t2t-decoder command (although it skips some sentences). However, when I use the Python script from this Tensor2Tensor Intro, it gives me back a nonsense translation.
Is there a possible explanation for this?
Description
I trained a transformer model for English to French translation. It works well when I give it a single sentence to translate. However, when I give it a whole document (or simply a paragraph with many sentences), it gives me back a very bad translation and sometimes skips some sentences.
Has anyone encountered this kind of problem?