Closed Victoriaheiheihei closed 1 year ago
Hello, impressive work. One question: in the WikiText-103 experiment reported in the paper, did you initialize the model from other model checkpoints?
I don't quite understand what you mean. All the models in the paper were trained from scratch...