Closed: mattiadg closed this issue 7 years ago.
Hi,
Are you using Python 2.7? There were some issues with divisions. Could you retry with the latest version?
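For context, the Python 2 pitfall in question is integer division; a minimal sketch of the kind of bug meant here, not the actual OpenNMT-py code:

```python
# Python 2 truncates / on integer operands, so a naive accuracy
# computation silently comes out as 0.
num_correct, num_words = 437, 1000

print(num_correct / num_words * 100)         # 0 in Python 2, 43.7 in Python 3
print(float(num_correct) / num_words * 100)  # 43.7 in both

# Or put `from __future__ import division` at the top of the module.
```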
Now I'm getting positive accuracy values during training, but I still couldn't get decoding to work. I'll give you more info soon.
This might help: https://github.com/OpenNMT/OpenNMT-py/pull/22, but it will only impact unk replacement. If you weren't getting anywhere near good translations, we might need more info since I'm using IWSLT de-en and getting good results.
Without #22 I get the following error. I'm using Python 2.7 and the flag -replace_unk:
```
Traceback (most recent call last):
  File "/hltsrv0/digangi/OpenNMT-py/translate.py", line 135, in <module>
    main()
  File "/hltsrv0/digangi/OpenNMT-py/translate.py", line 89, in main
    predBatch, predScore, goldScore = translator.translate(srcBatch, tgtBatch)
  File "/hltsrv0/digangi/OpenNMT-py/onmt/Translator.py", line 204, in translate
    for n in range(self.opt.n_best)]
  File "/hltsrv0/digangi/OpenNMT-py/onmt/Translator.py", line 62, in buildTargetTokens
    tokens[i] = src[maxIndex[0]]
IndexError: list index out of range
```
It can translate without -replace_unk, though.
Yeah, that's what I was seeing too. Translation with -replace_unk would only have worked on datasets without unks or when using batch_size 1. It should be fixed now.
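For readers hitting the same trace: -replace_unk copies the most-attended source word into each unk position, and the crash came from indexing past the end of a shorter source sentence in a padded batch. A rough sketch of the idea (hypothetical names and structure, not the actual Translator.py code):

```python
# Hypothetical sketch of attention-based unk replacement; the function
# name and argument layout are assumptions, not the real Translator.py.
def build_target_tokens(pred_tokens, src_tokens, attn):
    """attn[i] is a list of attention weights over (possibly padded)
    source positions for target position i."""
    tokens = list(pred_tokens)
    for i, tok in enumerate(tokens):
        if tok == '<unk>':
            # Copy the source word the decoder attended to most.
            j = max(range(len(attn[i])), key=lambda k: attn[i][k])
            # Guard: with padded batches the argmax can point past the
            # end of this sentence's real tokens; the unguarded lookup
            # is what raised the IndexError above.
            if j < len(src_tokens):
                tokens[i] = src_tokens[j]
    return tokens
```

With batch_size 1 there is no padding, so the unguarded lookup never went out of range; that is why the bug only showed up on batched data containing unks.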
Regarding your poor decoding performance, what validation accuracy/perplexity are you getting down to? For the demo data, I don't think you'll ever get good translations. That was part of the motivation for adding Multi30k. I think eventually, we'll just remove the demo data. Multi30k and IWSLT should work pretty well though.
The poor decoding I was talking about was before the fix, when the output was totally random. Now I get reasonable translations, but I trained only on the TED talks, so the BLEU score isn't very high. Today I'll try to decode with your latest fix, and if it works I won't have anything to add to this thread.
Ok, I've downloaded the latest version and I no longer get that error. What I get now is an out-of-memory error (I'm using a K80):
```
THCudaCheck FAIL file=/data/users/soumith/builder/wheel/pytorch-src/torch/lib/THC/generic/THCStorage.c line=79 error=2 : out of memory
```
And this is the command:

```
python ~/OpenNMT-py/translate.py -gpu 0 -model models/model_*_e20.pt -src data/dev.en.lc.txt -tgt data/dev.fr.lc.txt -verbose -output demo_pred.txt -beam_size 5 -batch_size 20 -replace_unk
```
Without -replace_unk, it is able to translate even with beam_size 10 and batch_size 40.
Maybe it is better not to use Python 2.7?
That seems odd. replace_unk only touches this part of the code, which shouldn't take up much memory. I looked around and realized the Variables for the data could be made volatile, though, so that should help some (6b4cb9d60eb662a736b09c69457375939cff5dc6). Let me know if you still see issues.
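The change being referenced is the pre-0.4 PyTorch idiom of marking inference inputs as volatile so autograd skips building a graph; a minimal sketch under that assumption, with a dummy tensor standing in for real data:

```python
import torch
from torch.autograd import Variable

src_data = torch.LongTensor([[4, 17, 9, 3]])  # dummy batch of token ids

# Pre-0.4 PyTorch (matching this era of the codebase): volatile=True
# tells autograd not to record a graph for anything computed from this
# input, which cuts memory during translation.
src = Variable(src_data, volatile=True)

# In PyTorch >= 0.4 the same effect comes from:
#   with torch.no_grad():
#       output = model(src_data)
```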
Hi,
I'm running the latest OpenNMT-py with the latest PyTorch and CUDA 7.5 on a K80 GPU. I ran training both with the data provided in the example and with the en-fr data from IWSLT 2016. In both cases, the perplexity during training gets very low (single digits for every minibatch), but the accuracy is always 0.0. Moreover, when I try to translate the validation sets, the translations seem totally random.