Sure. RNN-based language modeling on PTB usually treats the entire corpus as one long sequence. Hence, if the sentences are kept in their original order (as is the case with Mikolov's version of the data), the model can use information from previous sentences in a meaningful way; shuffling the sentences removes that cross-sentence context.
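For concreteness, here is a minimal sketch of that batching scheme, assuming a PyTorch-style word-level LM setup (as in the common PTB examples). The function names are illustrative, not necessarily this repository's exact code:

```python
# Sketch of why sentence order matters: the whole corpus is concatenated into
# one long token stream, and the hidden state is carried from one chunk to the
# next, so earlier sentences provide context for later ones.
import torch

def batchify(token_ids, batch_size):
    """Concatenate the corpus into one sequence, then split it into
    `batch_size` parallel streams of contiguous text."""
    data = torch.tensor(token_ids, dtype=torch.long)
    n_chunks = data.size(0) // batch_size
    data = data[:n_chunks * batch_size]                 # trim the ragged tail
    return data.view(batch_size, -1).t().contiguous()   # shape: (seq_len, batch_size)

def get_chunk(source, i, bptt=35):
    """Slice a contiguous chunk; targets are the inputs shifted by one token."""
    seq_len = min(bptt, source.size(0) - 1 - i)
    data = source[i:i + seq_len]
    target = source[i + 1:i + 1 + seq_len]
    return data, target

# During training, the hidden state returned for one chunk is detached and fed
# into the next chunk, so information flows across sentence boundaries.
# Shuffling the sentences breaks exactly this kind of context.
```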
I see. Thank you very much!
Hi,
I have found that the model performs much worse when trained on the shuffled PTB data. For example, the final perplexity (PPL) of the large word-level model is 97.79. Do you have any idea why?
Thanks!