yunsukim86 / wbw-lm

Context-aware beam search for unsupervised word-by-word translation

Language Model Question #1

Closed BinWang28 closed 5 years ago

BinWang28 commented 5 years ago

Hi Yunsu,

Thanks for your excellent work. I am trying to reproduce the results in your paper, and I have a question about the language model.

Which corpus did you use to train the language model with KenLM? Is it the same corpus used for training the monolingual word embeddings?

Thanks so much!

yunsukim86 commented 5 years ago

Hi Bin,

Yes, we used the same data for learning the word embeddings and the language model: 100M sentences sampled from the News Crawl 2014-2017 monolingual corpora.

Best regards, Yunsu

BinWang28 commented 5 years ago

Many thanks!