bzhangGo closed this issue 4 years ago
Hi, the unsupervised NMT results are based on a slightly modified version of a separate codebase, XLM, to allow a fair comparison with the SOTA systems. We basically follow its instructions for hyperparameters and data collection, and add a language-model-based KL loss on top. Unfortunately, our modified code is on a cluster that I don't have access to right now.
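For anyone trying to re-implement this, here is a rough sketch of what a language-model-based KL term could look like for a single target position. It is only an illustration, not the modified XLM code: the function names, the direction of the KL (LM distribution against translation-model distribution), the `temperature`, and the `kl_weight` are all assumptions. It uses plain Python so it runs without any framework.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) over one vocabulary distribution; eps guards against log(0).
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def lm_kl_loss(tm_logits, lm_logits, temperature=1.0):
    # Assumed form: KL(p_lm || p_tm), where p_lm comes from a frozen
    # language model and p_tm from the translation model.
    p_lm = softmax(lm_logits, temperature)
    p_tm = softmax(tm_logits, temperature)
    return kl_divergence(p_lm, p_tm)

def total_loss(base_loss, tm_logits, lm_logits, kl_weight=0.5, temperature=1.0):
    # base_loss stands in for the usual XLM training objective;
    # kl_weight is a hypothetical interpolation weight.
    return base_loss + kl_weight * lm_kl_loss(tm_logits, lm_logits, temperature)
```

In a real setup the same term would be computed over batched logits for every target position and averaged, with gradients flowing only into the translation model.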
Thanks.
Hi, could you please show more details and instructions on how to reproduce the WMT16 En-De translation results (En-De 26.9, De-En 32.0) with only 5M non-parallel sentences using this codebase? Thanks.