yandex / faster-rnnlm

Faster Recurrent Neural Network Language Modeling Toolkit with Noise Contrastive Estimation and Hierarchical Softmax

Random text generator from original RNNLM; CRnnLM::testGen() #6

Closed scottlingran closed 9 years ago

scottlingran commented 9 years ago

Any chance you're re-implementing the original CRnnLM::testGen() method?

e.g. https://github.com/katakombi/rnnlm/blob/master/rnnlmlib.cpp#L2361-L2497

akhti commented 9 years ago

Well, there is an HSTree::SampleWord method that lets you sample words from a model trained with HS (hierarchical softmax). On the other hand, there is no efficient way to sample words from a model trained with NCE (noise contrastive estimation). Besides, faster-rnnlm and crnnlm are trained to be used in a rescoring setting. That is, all words in the history are expected to be real ones, whereas during sampling some words in the history could be totally inaccurate. To counter this bias, a sampling trick should be used (see http://arxiv.org/abs/1506.03099 ). Those are the main reasons why a generate mode is missing. But maybe I will add it in a while.
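For readers unfamiliar with why sampling is cheap under HS, here is a minimal sketch of the general idea: walk a binary tree from the root to a leaf, flipping a biased coin at each internal node. This is a toy illustration only and is not the actual HSTree::SampleWord code. ToyHSTree, LeftProbability, the node numbering, and the weight layout are all hypothetical names and assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical toy tree, NOT faster-rnnlm's HSTree.
// Internal nodes are numbered 0..NumInternal()-1, with node 0 as the root.
// Any child id >= NumInternal() is a leaf; its word id is (id - NumInternal()).
struct ToyHSTree {
  std::vector<int> left;                     // left child id of each internal node
  std::vector<int> right;                    // right child id of each internal node
  std::vector<std::vector<double>> weights;  // one weight vector per internal node

  int NumInternal() const { return static_cast<int>(left.size()); }
};

double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Probability of taking the left branch at `node`, given the hidden state.
double LeftProbability(const ToyHSTree& tree, int node,
                       const std::vector<double>& hidden) {
  const std::vector<double>& w = tree.weights[node];
  assert(w.size() == hidden.size());
  double dot = 0.0;
  for (std::size_t i = 0; i < w.size(); ++i) dot += w[i] * hidden[i];
  return Sigmoid(dot);
}

// Draws one word by descending from the root; the cost is proportional to
// the depth of the sampled leaf, not to the vocabulary size.
template <class Rng>
int SampleWord(const ToyHSTree& tree, const std::vector<double>& hidden,
               Rng& rng) {
  assert(tree.NumInternal() > 0);
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  int node = 0;
  while (node < tree.NumInternal()) {
    const double p_left = LeftProbability(tree, node, hidden);
    node = (uniform(rng) < p_left) ? tree.left[node] : tree.right[node];
  }
  return node - tree.NumInternal();
}
```

Each draw touches only the nodes on one root-to-leaf path. An output layer without such a tree structure would, in general, need a score for every vocabulary word before a single word could be drawn, which is the efficiency concern raised above for NCE-trained models.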

akhti commented 9 years ago

Added basic sampling support in 038a4ebe. Usage example:

echo the meaning of life is | ./rnnlm --rnnlm <modelname> --generate-samples 10

scottlingran commented 9 years ago

Amazing! Here's what I got :)

the meaning of life is | training

I'll play with HSTree::SampleWord too, thanks so much.