This introduces a max length per batch at inference.
By default, max_length was set to 250 tokens.
It is now min(max_length, batch_max_len * max_length_ratio) + 5.
Say max_length_ratio is set to 1.25 (large enough for most European languages).
If a batch has examples of length 5 to 10, the max_length will be min(250, 10 x 1.25) + 5 = 17.5, truncated to 17.
The "5" offset prevents stopping too early on very short sequences, and max_length_ratio makes it possible to stop before max_length (which might be too high), preventing hallucinations and repetitions.
To disable the feature, set max_length_ratio to zero.
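The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the function name is hypothetical, and only the parameter names (max_length, max_length_ratio) come from the description.

```python
def effective_max_length(batch_src_lengths, max_length=250, max_length_ratio=1.25):
    """Illustrative sketch of the per-batch decoding length cap described above."""
    if max_length_ratio == 0:
        # Feature disabled: fall back to the global max_length cap.
        return max_length
    batch_max_len = max(batch_src_lengths)
    # min(max_length, batch_max_len * max_length_ratio) + 5, truncated to an int.
    return int(min(max_length, batch_max_len * max_length_ratio) + 5)

# Example from the description: source lengths 5..10 with ratio 1.25.
print(effective_max_length(range(5, 11)))  # 17
```

With very long batches the min() clamps the cap back to max_length, so the global limit still holds.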