adjouama closed this issue 4 years ago
Could you provide more details on what it means that the model does not translate long sentences? Is there a segfault, or is no output produced? Is it for a single sentence, or are you translating a file?
The marian-decoder has the --max-length option, which by default is 1000. I guess your sentences are not that long after subword segmentation?
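One way to check this is to count the tokens per line in the already-segmented input and compare the count against the limit. Below is a minimal sketch, assuming the input has already been split into subword units separated by spaces (BPE-style "@@" markers are used here only as sample data). The function name find_overlong is made up for illustration and is not part of Marian.

```python
def find_overlong(segmented_lines, max_length=1000):
    """Return (line_number, token_count) for lines longer than max_length.

    Assumption: each line is already subword-segmented, with tokens
    separated by whitespace. The default of 1000 mirrors the value
    mentioned above; pass whatever limit you actually use.
    """
    overlong = []
    for number, line in enumerate(segmented_lines, start=1):
        count = len(line.split())  # tokens = whitespace-separated units
        if count > max_length:
            overlong.append((number, count))
    return overlong


# Hypothetical sample data, segmented by hand for illustration.
sample = [
    "The World Tele@@ communication",                # 4 tokens
    "Robo@@ race , a sports media business",         # 7 tokens
]

# A small limit is used here so that the second line is flagged.
print(find_overlong(sample, max_length=5))
```

If this reports nothing at the limit you decode with, the length option is probably not what is cutting the translation short.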
Thank you for the quick answer, and sorry for the lack of information. Basically, the output stays the same as the input. Sometimes it translates only a few words (see the example output below):
The World Telecommunication/ICT indicators database on USB Key and online contains time series data for the years 1960, 1965, 1970 and year from 1975 to 2018 for more than 180 telecommunication/ICT statistics covering fixed-phone networks, mobile-cellular telephone subscriber, quality of service, Internet (including fixed- and mobile broadband subscriber data), traffic, staff, price, investment, investment and investment on ICT access and use by home and persons. Sred population,宏 economic and broadcasting statistics are also included. Data for over 200 economic are available.
I send my long paragraph to marian-server as a single sentence.
Is that related to the --max-length parameter in the training process?
Note that short sentences get translated perfectly, with very good quality.
Thank you in advance,
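Since the paragraph is sent as one unit, one thing that could be tried is splitting it into sentences first and translating each piece separately. Below is a rough sketch, assuming a naive regex split on sentence-ending punctuation is good enough for the input. split_sentences is a hypothetical helper, and how each piece is then sent to marian-server is deliberately left out.

```python
import re


def split_sentences(paragraph):
    """Naively split a paragraph into sentences.

    Assumption: a sentence ends with '.', '!' or '?' followed by
    whitespace and an uppercase letter. Abbreviations such as "e.g. The"
    will be split incorrectly; a proper sentence splitter handles those.
    """
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", paragraph.strip())
    return [part for part in parts if part]


text = (
    "Broadcasting statistics are also included. "
    "Data for over 200 economies are available."
)
for sentence in split_sentences(text):
    print(sentence)
```

This only shortens what the model sees per request; it does not explain why a long input comes back mostly untranslated.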
Here is another example:
Input:
Roborace, a sports media business developing completely new forms of autonomous motorsport, also established ADA (Autonomous Drivers Alliance), a non-profit association focused on contributing to global action in the interests of road safety – the initial outlines of which were shared with the UK CAV industry during Goodwood.
Output:
Roborace, a sports media business developing complete new form of 自主机动车, also established ADA (Autonomous Drivers Alliance), a其非营利协会主要集中在为道路安全做出贡献的全球性行动方面 – the initial outline of which were shared with the UK CAV industry during goodwood.
Issue solved after playing around with the vocabulary size. Thanks.
I trained an English-Chinese model. However, during decoding, it does not translate long sentences. Below are my valid.log and my configuration. Thank you very much in advance.
Valid.log:
Configuration: