Closed: happyMinionVic closed this issue 5 years ago
By the way, the error occurs at the start of training, not while running preprocess.py.
Could you provide the other configs you used when running preprocess.py?
Here is the config; the only difference is -knl_seq_length_trunc 100:

python preprocess.py \
    --train_src data/src-train-tokenized.txt \
    --valid_src data/src-valid-tokenized.txt \
    --train_knl data/knl-train-tokenized.txt \
    --valid_knl data/knl-valid-tokenized.txt \
    --train_tgt data/tgt-train-tokenized.txt \
    --valid_tgt data/tgt-valid-tokenized.txt \
    --save_data data/cmu_movie \
    -dynamic_dict \
    -share_vocab \
    -src_seq_length_trunc 50 \
    -tgt_seq_length_trunc 50 \
    -knl_seq_length_trunc 100 \
    -src_seq_length 150 \
    -knl_seq_length 800
You can try -knl_seq_length 400 and modify the code in onmt/encoders/mtransformer.py. Note that we use the three previous dialogue turns as context, so knl_seq_length is four times knl_seq_length_trunc.
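To make the 4-to-1 relationship concrete, here is a minimal sketch (not the repository's actual code; the constant names are made up) of why -knl_seq_length must be 4 * -knl_seq_length_trunc, and of the segment boundaries that any hardcoded slice such as knl[600:800] would have to be rewritten against when the truncation length changes:

# Minimal sketch, assuming the knowledge input concatenates the knowledge
# with the three previous dialogue turns, i.e. four equal-length segments.
KNL_SEQ_LENGTH_TRUNC = 100                               # reduced from the default 200
NUM_SEGMENTS = 4                                         # knowledge + three previous turns
KNL_SEQ_LENGTH = NUM_SEGMENTS * KNL_SEQ_LENGTH_TRUNC     # -> 400

# Segment boundaries implied by the new truncation length:
boundaries = [(i * KNL_SEQ_LENGTH_TRUNC, (i + 1) * KNL_SEQ_LENGTH_TRUNC)
              for i in range(NUM_SEGMENTS)]
print(boundaries)    # [(0, 100), (100, 200), (200, 300), (300, 400)]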
OK, I will try this right now. Are there any other restrictions on the sequence lengths in preprocessing, such as -src_seq_length being three times -src_seq_length_trunc?
Yes. You can modify the code in onmt/encoders/mtransformer.py to change this setting.
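For reference, the config posted above already satisfies this 3-to-1 restriction; the following lines are only an illustrative check, not project code:

src_seq_length_trunc = 50
src_seq_length = 3 * src_seq_length_trunc   # three truncated segments on the source side
assert src_seq_length == 150                # matches -src_seq_length 150 in the config above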
What happens to knl[600:800]? Is it just ignored?
No, knl[600:800] is used in the decoding process.
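To make the slicing concrete, here is a hypothetical illustration with dummy data (not the repository's code): under the default knl_seq_length_trunc = 200, an 800-token knowledge sequence splits into four 200-token segments, and the last slice, knl[600:800], is the part consumed during decoding as stated above. Which segment holds the knowledge versus the context turns is an assumption here:

import torch

# Hypothetical illustration only: default knl_seq_length_trunc = 200,
# so knl_seq_length = 4 * 200 = 800. Dummy token ids, shape (seq_len, batch).
trunc = 200
knl = torch.randint(0, 1000, (4 * trunc, 2))

segments = [knl[i * trunc:(i + 1) * trunc] for i in range(4)]
decoder_part = segments[-1]          # i.e. knl[600:800], used at decoding time
print(decoder_part.shape)            # torch.Size([200, 2])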
I have modified all the related parameters in mtransformer.py and still get the Floating point exception error. My PyTorch is 1.0.0 and torchtext is 0.4.0. Where does this bug come from?
Oh, sorry, you should also modify the parameters in onmt/models/model.py.
I had run the code successfully with the given config. But when I tried to reduce the knowledge sentence padding length in preprocessing, by lowering the padding length of knl from 200 to 100, I got a Floating point exception (core dumped) error with no other traceback. Could you figure out how this happened and how it can be fixed? Thanks.