shrimai / Style-Transfer-Through-Back-Translation


Translation with style, dimension 0 out of range of 0D tensor #14

Closed Jaxonwht closed 6 years ago

Jaxonwht commented 6 years ago

Hi,

I ran into an issue in the final translation-with-style step. I added the "-with_style" flag so that onmt.Translator_style is used instead of onmt.Translator. The encoder and decoder models were both downloaded from the websites mentioned in the README. I'm currently running Python 3.6.6 and PyTorch 0.3.0. Translating the texts from English to French, preprocessing, and train_decoder all worked fine.

[haotian@jaxon style_decoder]$ python translate.py -with_style -encoder_model ../models/translation/french_english/french_english.pt -decoder_model ../models/style_generators/democratic_generator.pt -src ../data/political_data/test.fr -output trained_models/republican_democratic.txt -replace_unk $true
Traceback (most recent call last):
  File "translate.py", line 142, in <module>
    main()
  File "translate.py", line 96, in main
    predBatch, predScore, goldScore = translator.translate(srcBatch, tgtBatch)
  File "/home/haotian/Desktop/Polarization Lab/Style-Transfer-Through-Back-Translation/style_decoder/onmt/Translator_style.py", line 197, in translate
    src, tgt, indices = dataset[0]
  File "/home/haotian/Desktop/Polarization Lab/Style-Transfer-Through-Back-Translation/style_decoder/onmt/Dataset.py", line 45, in __getitem__
    align_right=False, include_lengths=True)
  File "/home/haotian/Desktop/Polarization Lab/Style-Transfer-Through-Back-Translation/style_decoder/onmt/Dataset.py", line 28, in _batchify
    lengths = [x.size(0) for x in data]
  File "/home/haotian/Desktop/Polarization Lab/Style-Transfer-Through-Back-Translation/style_decoder/onmt/Dataset.py", line 28, in <listcomp>
    lengths = [x.size(0) for x in data]
RuntimeError: invalid argument 2: dimension 0 out of range of 0D tensor at /opt/conda/conda-bld/pytorch_1512386481460/work/torch/lib/TH/generic/THTensor.c:24

Jaxonwht commented 6 years ago

I figured out that this is caused by an empty line in the intermediate French file.

chejmadi commented 5 years ago

Hi, we're getting the same issue, but I'm not sure whether that's due to an empty line, since the error occurs in the middle of the data file being translated. What's more, when we ran the code with a reduced file of 10,000 lines, it finished the whole thing without an error, and even that file ended with an empty line. How did you resolve your issue?

Jaxonwht commented 5 years ago

> Hi, we're getting the same issue, but I'm not sure whether that's due to an empty line, since the error occurs in the middle of the data file being translated. What's more, when we ran the code with a reduced file of 10,000 lines, it finished the whole thing without an error, and even that file ended with an empty line. How did you resolve your issue?

There was an empty line in the intermediate French text, not the original English text. You may want to check that instead.
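A quick way to run that check is to scan the intermediate file for lines that produce no tokens. This is only a sketch: it assumes the translation script tokenizes each source line with a plain whitespace split, so a line containing only spaces or tabs is as bad as a truly empty one. The function name `find_tokenless_lines` is made up for this example.

```python
def find_tokenless_lines(path):
    """Return the 1-based numbers of lines that yield no tokens.

    Assumption: tokens come from str.split(), so whitespace-only lines
    count as empty even though they are not zero-length.
    """
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.split():
                bad.append(lineno)
    return bad
```

Calling it on the file passed to -src (for example `find_tokenless_lines("../data/political_data/test.fr")`) returns an empty list if every line has at least one token.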

chejmadi commented 5 years ago

So, yes, I do mean the intermediate French text. There was no blank line there. It translated 27,180 lines of the republican set and gave this error. For the democratic French set, it did around 11,000 lines. There was no blank line here, I've checked multiple times. :/

Jaxonwht commented 5 years ago

> So, yes, I do mean the intermediate French text. There was no blank line there. It translated 27,180 lines of the republican set and gave this error. For the democratic French set, it did around 11,000 lines. There was no blank line here, I've checked multiple times. :/

You can print the item that triggers the error to see exactly which line it is. For example, change the list comprehension lengths = [x.size(0) for x in data] in Dataset.py into a for loop and add print statements inside the loop.
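As a rough sketch of that idea, the comprehension can be unrolled as below. The helper name `lengths_with_report` is hypothetical and not part of the repository. It relies only on each item having a `size(dim)` method, and it catches both `RuntimeError` (what the traceback above shows for PyTorch 0.3.0) and `IndexError` (an assumption about what newer PyTorch versions raise for the same condition).

```python
def lengths_with_report(data):
    """Loop form of `lengths = [x.size(0) for x in data]`.

    Instead of crashing on the first bad item, it prints each offender
    and returns (lengths of the good items, indices of the bad items).
    """
    lengths = []
    bad_indices = []
    for i, x in enumerate(data):
        try:
            lengths.append(x.size(0))
        except (RuntimeError, IndexError) as err:
            # An item with no dimension 0 is most likely an empty sentence.
            print("item {} in this batch has no dimension 0: {}".format(i, err))
            bad_indices.append(i)
    return lengths, bad_indices
```

The printed index is relative to the batch passed in, so it still has to be mapped back to a line of the source file, for instance by also printing the batch number in the calling loop.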