Some of the hyperparameters had to change because of the PyTorch 0.2 port and the discovery of the dropout bug. The new hyperparameters should reproduce the perplexity numbers on PTB and WT2 from the paper closely, or even improve on them.
LGTM. The Salesforce CLA bot flag was caused by an earlier commit that wasn't tied to @keskarnitish, and that has been fixed. Glad to get the hyperparameters out there for others :)