THUNLP-MT / THUMT

An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
BSD 3-Clause "New" or "Revised" License

Question about BPE #35

Closed rgwt123 closed 6 years ago

rgwt123 commented 6 years ago

1. If I don't use BPE, will it affect the results? If it does, will BLEU drop by a lot? 2. In the "tips for transformer" paper I read that batch_size has a large effect on results. Have you run into this, and what is the minimum batch_size I should set? (With the default batch_size, a 1080 runs out of GPU memory.)

XiaoqingNLP commented 6 years ago

@rgwt123 1. Without BPE, BLEU will drop. 2. Make batch_size as large as memory allows; if that is still not enough, increase update_cycle. The effective batch size during training is batch_size × number of GPUs × update_cycle.
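The multiplication rule above can be sketched as a small helper. The names `batch_size` and `update_cycle` mirror THUMT's hyperparameters; `num_gpus` and the example values are illustrative assumptions, not values from this thread.

```python
def effective_batch_size(batch_size: int, num_gpus: int, update_cycle: int) -> int:
    """Tokens (or sentences) consumed per parameter update when gradients
    are accumulated over update_cycle steps across num_gpus devices."""
    return batch_size * num_gpus * update_cycle

# Illustrative example: 2048 tokens per device, 2 GPUs, accumulate 4 steps.
print(effective_batch_size(2048, 2, 4))  # 16384 tokens per update
```

So a single 1080 with a small per-device batch_size can still match a larger effective batch by raising update_cycle, at the cost of slower wall-clock training.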

rgwt123 commented 6 years ago

@zxqchat Thanks for the answer. What is the difference between the BPE used here and the built-in subword implementation in tensor2tensor? Aren't both meant to solve the OOV problem?

XiaoqingNLP commented 6 years ago

@rgwt123 I have not checked the differences between the THUMT and tensor2tensor BPE scripts; the principle behind both should be the same. You can check it yourself.
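The shared principle both commenters refer to is the BPE merge rule of Sennrich et al. (2016): repeatedly merge the most frequent adjacent symbol pair. A toy sketch, not the actual THUMT or tensor2tensor code (which differ in tokenization and escaping details):

```python
from collections import Counter

def most_frequent_pair(vocab):
    """Find the most frequent adjacent symbol pair in a
    {word-as-symbol-tuple: frequency} vocabulary."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(vocab, pair):
    """Replace every occurrence of the pair with its concatenation."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Tiny illustrative corpus: words split into characters plus "</w>".
vocab = {("l", "o", "w", "</w>"): 5,
         ("l", "o", "w", "e", "r", "</w>"): 2,
         ("n", "e", "w", "e", "s", "t", "</w>"): 6}
for _ in range(3):
    pair = most_frequent_pair(vocab)
    vocab = merge_pair(vocab, pair)
    print(pair)
```

Each learned merge becomes a subword unit, so rare or unseen words decompose into known pieces instead of OOV tokens, which is why both toolkits' variants attack the same problem.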