@haorannlp Hi, this version of the OR-RNN model may not have been tested thoroughly, so there may be some problems. I think the gradient became NaN because some number was divided by zero, or sqrt was applied to a negative number, so you should add an epsilon to the operands of the division and sqrt operations.
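A minimal sketch of the epsilon guard suggested above, in PyTorch; the variable names (`numerator`, `denominator`, `var`, `eps`) are illustrative, not taken from the OR-NMT code:

```python
import torch

eps = 1e-8  # small constant; the exact value is a tuning choice

numerator = torch.randn(4)
denominator = torch.zeros(4)  # worst case: exact zeros
var = torch.randn(4)          # may contain small negative values

# Guarded division: keep the denominator away from zero.
ratio = numerator / (denominator + eps)

# Guarded sqrt: clamp the operand so it is never negative before adding eps.
std = torch.sqrt(var.clamp(min=0.0) + eps)
```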
@zhang-wen Thanks bro. I will check the code later.
Besides, the translation task code in Fairseq (translation.py line 43; indexed_dataset.py lines 102 and 106) seems to need `.idx` and `.bin` files, while my data files are named `train.BPE.en`, `train.BPE.de` / `newstest2013.en-de.en`, `newstest2013.en-de.de` / `newstest2014.en-de.en`, `newstest2014.en-de.de`. Would you mind telling me how to name the train/val/test data files when running OR-Transformer, and whether we need any modifications to the training command? Thanks.
@haorannlp Yes, first you need to generate the `data_bin` directory by running `preprocess.py` in fairseq. Please refer to the generation scripts for `wmt16_en_de_bpe32k` at this link: https://github.com/ictnlp/awesome-transformer. After that, you can run the training command with `python train.py $data_dir`, just as in our README.md at https://github.com/ictnlp/OR-NMT. Feel free to ask any questions, thanks.
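For reference, a sketch of what that preprocessing call might look like with the file names mentioned above; the flags should be double-checked against the fairseq version in use, and the `--destdir` path is just an example:

```bash
# Binarize the BPE'd text into the .idx/.bin format fairseq expects.
python preprocess.py \
    --source-lang en --target-lang de \
    --trainpref train.BPE \
    --validpref newstest2013.en-de \
    --testpref newstest2014.en-de \
    --destdir data_bin/wmt_en_de \
    --joined-dictionary
```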
@zhang-wen Thanks, it worked!
BTW, why is it WMT'16 instead of WMT'14 as mentioned in the paper? When people talk about WMT'14, are they referring to WMT'14 Europarl-v7? I'm a little confused since there are several datasets in WMT'XX.
One last thing: the training command in README.md uses Transformer big instead of Transformer base as the original paper does. I guess this is a typo.
@haorannlp Hi, the WMT'14 En-De training set we used was obtained with the shell script https://github.com/pytorch/fairseq/blob/master/examples/translation/prepare-wmt14en2de.sh provided by Fairseq.
Hi Zhang Wen,
After I ran the OR-RNN model for 20 hours with the default parameters in wargs.py (I only changed the data directory), the grad became NaN. Do you have any ideas? Thanks.

My configuration:
python 2.7
torch 1.0.1
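Not specific to OR-NMT, but as a general way to locate where a NaN first appears during training: PyTorch ships an anomaly-detection mode (available in the 1.0 line, as far as I know) that raises at the first backward op producing NaN. A minimal sketch, with a hypothetical stand-in model and loss:

```python
import torch

# Hypothetical stand-ins; only the anomaly-detection wrapper is the point.
model = torch.nn.Linear(8, 1)
x, y = torch.randn(16, 8), torch.randn(16, 1)

# Errors out at the first backward op that produces NaN, with a
# traceback pointing at the forward op responsible.
with torch.autograd.detect_anomaly():
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
```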