ZhenYangIACAS / NMT_GAN

generative adversarial nets for neural machine translation
Apache License 2.0

main process and puzzles about config_generator_train #4

Closed (jeicy07 closed this issue 6 years ago)

jeicy07 commented 6 years ago
1. Since all of my training and evaluation data are positive samples, is this the right order in which to run the files: vocab.py -> train.py -> generate_samples.py -> discriminator_pretrain.py -> gan_train.py -> evaluate.py?

2. I'm a little confused about what "evaluation.py" does. What do these keys in config_generator_train mean?

    test:
      src_path: '/data3/jeicy/data/eval/eval/eval_jieba.zh'
      dst_path: '/data3/jeicy/data/eval/eval/eval_jieba.zh'
      ori_dst_path: '/data/zhyang/dl4mt/corpus/data_450w_en_de/transformer/lf_50_for_gan_train/bleuTest/newstest2013/newstest2013.tok.de'
      output_path:

ZhenYangIACAS commented 6 years ago

Yes, your main process is right. I can't find a test section in our code that looks like the one you quoted. Which file did you find it in? The test section in config_generator_train.yaml looks like this:

    src_path: '/data/zhyang/dl4mt/corpus/data_450w_en_de/transformer/lf_50_for_gan_train/bleuTest/newstest2013/newstest2013.tok.bpe.32000.en'
    dst_path: '/data/zhyang/dl4mt/corpus/data_450w_en_de/transformer/lf_50_for_gan_train/bleuTest/newstest2013/newstest2013.tok.bpe.32000.de'
    ori_dst_path: '/data/zhyang/dl4mt/corpus/data_450w_en_de/transformer/lf_50_for_gan_train/bleuTest/newstest2013/newstest2013.tok.de'
    output_path: '/data/zhyang/dl4mt/corpus/data_450w_en_de/transformer/lf_50_for_gan_train/bleuTest/newstest2013/newstest2013.tok.de.output'
    batch_size: 256
    max_target_length: 200
    beam_size: 4
    lp_alpha: 0.6
    devices: '0,1'
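For readers who want the confirmed run order in one place, here is a minimal sketch that only lists the stages in sequence. The helper `build_commands` and its `extra_args` parameter are hypothetical and not part of the repository. The thread does not show which command-line arguments each script takes, so none are assumed.

```python
import sys

# Script order as confirmed in this thread. Any arguments the scripts
# need (for example a config file) are NOT shown in the thread, so the
# default commands below carry no arguments at all.
PIPELINE = [
    "vocab.py",
    "train.py",
    "generate_samples.py",
    "discriminator_pretrain.py",
    "gan_train.py",
    "evaluate.py",
]


def build_commands(python=sys.executable, extra_args=None):
    """Return one argv list per stage, in run order.

    extra_args is a hypothetical hook: a dict mapping a script name to
    the list of arguments you have checked that script really accepts.
    """
    extra_args = extra_args or {}
    return [[python, script, *extra_args.get(script, [])] for script in PIPELINE]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for step, argv in enumerate(build_commands(), start=1):
        print(f"step {step}: {' '.join(argv)}")
```

Printing the commands first, rather than executing them, makes it easy to fill in the real arguments for each script before anything is run.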

jeicy07 commented 6 years ago

Yep, I mean that I'm puzzled by the meaning of these keys: src_path, dst_path, ori_dst_path and output_path. What does each of them mean? Thanks.

ZhenYangIACAS commented 6 years ago

@jeicy07
src_path: the path for the source file
dst_path: the path for the reference file which has been bpe
ori_dst_path: the path for the reference file which has not been bpe
output_path: the path to store the translated output
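The four keys described above can be checked before running an evaluation. The following is a minimal sketch, not code from the repository: `REQUIRED_TEST_KEYS` and `missing_test_keys` are hypothetical names, and the test section is assumed to have already been loaded into a plain dict. The example file names are the base names of the paths quoted earlier in this thread.

```python
# Roles of the four path keys, as described in this thread.
REQUIRED_TEST_KEYS = {
    "src_path": "source file to translate",
    "dst_path": "reference file, BPE-encoded",
    "ori_dst_path": "reference file, not BPE-encoded",
    "output_path": "where the translated output is stored",
}


def missing_test_keys(test_cfg):
    """Return the required keys that are absent or empty in test_cfg."""
    return [
        key
        for key in REQUIRED_TEST_KEYS
        if not str(test_cfg.get(key) or "").strip()
    ]


if __name__ == "__main__":
    example = {
        "src_path": "newstest2013.tok.bpe.32000.en",
        "dst_path": "newstest2013.tok.bpe.32000.de",
        "ori_dst_path": "newstest2013.tok.de",
        "output_path": "",  # left empty on purpose
    }
    for key in missing_test_keys(example):
        print(f"{key} is missing or empty ({REQUIRED_TEST_KEYS[key]})")
```

With the example dict, only `output_path` is reported, because it is the one key left empty.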

jeicy07 commented 6 years ago

Sorry, what do you mean by "has been bpe"?

ZhenYangIACAS commented 6 years ago

I am sorry for my casual expression. "Has been bpe" means that the words in the file have been segmented into subword units with byte-pair encoding (BPE), which is commonly used in NMT.
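To make byte-pair encoding concrete, here is a toy sketch of how BPE merges are learned from word frequencies. It is a textbook-style illustration, not the preprocessing used for this repository's data: the thread does not say which BPE tool produced the `bpe.32000` files, and the function names and the tiny word list below are made up for the example.

```python
from collections import Counter


def count_pairs(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merges; return (merges, final vocab)."""
    # Start from single characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = count_pairs(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair; first seen wins ties
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


if __name__ == "__main__":
    toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
    merges, vocab = learn_bpe(toy_counts, num_merges=3)
    print(merges)  # [('e', 's'), ('es', 't'), ('est', '</w>')]
    for symbols, freq in vocab.items():
        print(" ".join(symbols), freq)
```

After three merges the frequent ending "est" has become a single subword unit, so "newest" and "widest" share it. A file "which has been bpe" stores words split into units of this kind, which is why the evaluation config keeps a separate reference file that has not been segmented.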

jeicy07 commented 6 years ago

Thanks a lot!

ZhenYangIACAS commented 6 years ago

You're welcome. Best!