Hi, thank you very much for sharing your implementation.
I tried to apply this model to a Chinese corpus without labels. I reused the default parameter settings, only decreasing the batch size to 8 because an "out of memory" error occurred with larger batch sizes. However, the generation quality turned out to be poor after training for 20 epochs. By the way, I pretrained the language model with the "pretrained_lm" code.
Is there anything I need to pay special attention to during training, such as the number of documents, the length of each document, or the size of the vocabulary?
Looking forward to your reply.