Closed kateryna-bud closed 2 years ago
Hey @kateryna-bud,
Marian models were not really trained on inputs such as [" ", "-"], so this data can be considered strongly out-of-distribution, and the outputs will be unpredictable.
Why would you need to translate a single empty space? :-)
Hi @patrickvonplaten,
thanks for your answer. I have other NLP preprocessing steps, and in some cases they produce empty sentences. I remove those now, but I wonder why the model returns "I don't know". I thought this might be a default setting for when the output is unpredictable. If so, I would like to adjust it.
What about the "-"? For other characters, the prediction is the same as the input, but not for the minus.
Thanks, Kateryna
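For anyone hitting the same thing, the workaround mentioned above (keeping empty or punctuation-only strings away from the model) could look roughly like the sketch below. It is only an illustration: `is_translatable` and `translate_with_passthrough` are hypothetical helper names, not part of transformers, and `translate_fn` stands for whatever function wraps the actual MarianMT call. The rule "contains at least one letter or digit" is an assumption and may be too strict or too loose for some data.

```python
from typing import Callable, List


def is_translatable(text: str) -> bool:
    """Return True if the text contains at least one letter or digit."""
    return any(ch.isalnum() for ch in text)


def translate_with_passthrough(
    texts: List[str],
    translate_fn: Callable[[List[str]], List[str]],
) -> List[str]:
    """Translate only meaningful inputs; return the others unchanged.

    translate_fn is assumed to take a list of strings and return a list
    of translations of the same length, in the same order.
    """
    to_translate = [t for t in texts if is_translatable(t)]
    # Skip the model call entirely when nothing needs translating.
    translated = iter(translate_fn(to_translate)) if to_translate else iter(())
    # Rebuild the output in the original order.
    return [next(translated) if is_translatable(t) else t for t in texts]
```

With this in place, inputs such as " " and "-" come back as they went in, which matches the expected behavior described in the issue, and the model only ever sees strings it has a chance of handling.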
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Hi @patrickvonplaten
The translation model for the input word 'hec' returns
['Hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey, hey.']
How should I handle those "hiccups"?
Hey @kateryna-bud - I don't think "hec" is a valid word, and I also don't know what you would expect the translation to be here. In general, translation models are by no means perfect and can behave unexpectedly. You could try some of the methods described here: https://huggingface.co/blog/how-to-generate
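As a rough sketch of what trying those methods might look like, the snippet below passes a few repetition-related arguments to `generate`. The checkpoint name `Helsinki-NLP/opus-mt-de-en` is an assumption (the thread does not say which language pair is used), and the specific values are illustrative guesses, not recommended settings. These options can shorten or break up repeated output, but they do not make the result for an out-of-distribution input meaningful.

```python
from typing import List

# Illustrative values only; tune them for your own data.
GENERATION_KWARGS = {
    "num_beams": 4,
    "no_repeat_ngram_size": 3,   # forbid repeating any 3-gram in the output
    "repetition_penalty": 1.2,   # down-weight tokens that were already generated
    "max_length": 64,            # hard cap on the output length in tokens
}


def translate(
    texts: List[str],
    model_name: str = "Helsinki-NLP/opus-mt-de-en",  # assumed checkpoint
) -> List[str]:
    """Translate a batch of strings with repetition-limiting settings."""
    # Imported here so the module can be loaded without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    output_ids = model.generate(**batch, **GENERATION_KWARGS)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `translate(["hec"])` with these settings should at least stop the output from running on to the default length limit, though what the model actually returns is still not predictable.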
Did anyone find a resolution to this?
Environment info
transformers version: 4.12.2
Who can help
@patrickvonplaten
Information
Model I am using (MarianMT):
The problem arises when using:
The tasks I am working on is:
To reproduce
Steps to reproduce the behavior:
Output:
["I don't know.", '- No, no, no, no, no, no, no.']
Expected behavior
Return an empty sentence or the character, since there is nothing to translate
[" ", "-"]
Thanks in advance!
Cheers, Kateryna