nyu-dl / dl4mt-nonauto

BSD 3-Clause "New" or "Revised" License

Different batch_size leads to different results #9

Open JackHorse opened 5 years ago

JackHorse commented 5 years ago

Hi, I have been reproducing your results on the IWSLT-16 En-De experiments using the NAT pre-trained models. However, I get different results when I use different batch_size values.

[two screenshots showing the differing results omitted]

Can you tell me why?

jaseleephd commented 5 years ago

Hmm, that is weird. Can you post the exact decoding script you used? If you're using length prediction, can you try disabling it and running again?

JackHorse commented 5 years ago

When I disable length prediction, the problem is solved. But why is the BLEU score higher than before?

[screenshot of the BLEU scores omitted]

The script is:

--batch_size 1 --load_vocab --dataset iwslt-ende --vocab_size 40000 --ffw_block highway --params small --lr_schedule anneal --fast --valid_repeat_dec 20 --use_argmax --next_dec_input both --mode test --remove_repeats --debug --load_from 02.08_20.10.ptrn_model_voc40k_2048_5_278_507_2_drop_0.1_drop_len_pred_0.3_0.0003_anne_anneal_steps_250000_high_tr4_2decs__pred_both_copy_argmax_
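One common way batch size can change a length predictor's output (this is only a hypothetical illustration, not a confirmed diagnosis of this repo) is when pooling over encoder states fails to mask padded positions: the same sentence then pools differently depending on how much padding its batch requires. A minimal NumPy sketch, with all names made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical encoder states for one source sentence: length 3, hidden size 4
sent = rng.normal(size=(3, 4))

# decoding alone (batch_size 1): no padding, pool over the 3 real positions
pooled_alone = sent.mean(axis=0)

# decoding in a larger batch padded to length 5: two zero rows of padding
padded = np.vstack([sent, np.zeros((2, 4))])

# buggy pooling that ignores the padding mask averages over all 5 rows,
# so the pooled vector (and any length prediction built on it) changes
pooled_buggy = padded.mean(axis=0)

# correct masked pooling sums real positions and divides by the true length
mask = np.array([1, 1, 1, 0, 0], dtype=float)
pooled_masked = (padded * mask[:, None]).sum(axis=0) / mask.sum()

print(np.allclose(pooled_alone, pooled_masked))  # True:  masking restores batch-size invariance
print(np.allclose(pooled_alone, pooled_buggy))   # False: unmasked pooling depends on padding
```

If something like this were happening, decoding with `--batch_size 1` (no padding) would behave like the masked case, which is consistent with the scores diverging only when length prediction is enabled.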
jaseleephd commented 5 years ago

Hmm are you using the pretrained length prediction model? @mansimov might know more.

JackHorse commented 5 years ago

Thank you for your reply!

The pretrained model I use is the one you released at https://drive.google.com/open?id=1N8tfU5ttnov2jWk3-PHVMJClQA0pKXoN. But what is the pretrained length prediction model?

mansimov commented 5 years ago

Thanks for raising an issue!

Are you using the same setup as we did in the paper (pytorch 0.3 or 0.4)? I will try running the pretrained models myself as well.

JackHorse commented 5 years ago

I use the same setup as your requirements (pytorch 0.4, python 3.6.4, torchtext 0.2).