Open jm-glowienke opened 3 years ago
Hi @jm-glowienke, may I ask whether you fixed the issue? It still does not seem to work: the original input is not used to replace the <unk> in the translation.
Hi @jm-glowienke I would also like to know if there is any solution to this issue
Are you also using a transformer model?
This OpenNMT forum thread explains a bit about why their -replace-unk option does not work with the transformer model: https://forum.opennmt.net/t/translate-py-with-replace-unk-option-and-the-transformer-model/2646
It might be helpful.
[Update on Dec 04, 2022] My task was spelling correction, and I was trying to map all the special characters to <unk>. I used an alternative way to achieve that:
In this commit, they tried to add the -replace-unk feature, but I am not sure whether we have to go back to that version: https://github.com/facebookresearch/fairseq/commit/4815ed4d5e2b50fe85573a045fdf486ff8e64a58
Hi, I found a solution for the problems described in the issue. It can be found on my personal fork of fairseq: https://github.com/jm-glowienke/fairseq
Unfortunately, I cannot help you any further, as I only worked on this for my thesis almost 2 years ago.
🐛 Bug
When using fairseq-interactive to generate translations, the --replace-unk argument causes several bugs.
<unk> in your translation. So I think it would be good if, in this case, the original input were used to replace the <unk> in the translation.
To Reproduce
Steps to reproduce the behavior (always include the command you ran):
input text:
legal name of allianz
Allianz is an OOV word for my task.
For 1.:
For 2.:
Expected behavior
Replace the <unk> in the hypothesis by the corresponding word in the input according to the alignments. This should also be possible without an alignment dictionary.
I made a fix for 1., found a workaround for 2., and added some code to include the feature described in 3.
I can provide a PR if desired.
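For illustration, the expected behavior can be sketched in plain Python. This is not fairseq's actual implementation, only a minimal sketch: the function name replace_unk_tokens, the alignment format (one source index per hypothesis token), and the example hypothesis tokens are all assumptions made up for this example.

```python
from typing import Dict, List, Optional, Sequence

UNK = "<unk>"  # assumed surface form of the unknown-word symbol


def replace_unk_tokens(
    hypo_tokens: Sequence[str],
    src_tokens: Sequence[str],
    alignment: Sequence[int],
    align_dict: Optional[Dict[str, str]] = None,
    unk: str = UNK,
) -> List[str]:
    """Replace each unk in the hypothesis with its aligned source word.

    alignment[i] is the index of the source token aligned to hypothesis
    token i (hypothetical format chosen for this sketch). If align_dict
    is given, the aligned source word is looked up in it first; otherwise
    (or if the word is missing) the source word is copied unchanged.
    """
    result = []
    for i, token in enumerate(hypo_tokens):
        if token != unk:
            result.append(token)
            continue
        src_word = src_tokens[alignment[i]]
        if align_dict is not None:
            # Fall back to the raw source word when the dictionary has no entry.
            result.append(align_dict.get(src_word, src_word))
        else:
            # No alignment dictionary: copy the original input word.
            result.append(src_word)
    return result


# Made-up example based on the input text from this issue.
src = "legal name of allianz".split()
hypo = ["rechtlicher", "Name", "der", "<unk>"]  # hypothetical model output
align = [0, 1, 2, 3]                            # hypothetical alignment
print(replace_unk_tokens(hypo, src, align))
# ['rechtlicher', 'Name', 'der', 'allianz']
```

The point of the align_dict=None branch is that the replacement should still work without an alignment dictionary, by copying the aligned input word directly.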
Environment
How you installed fairseq (pip, source): CFLAGS="-stdlib=libc++" pip install --editable ./
Additional context