microsoft / DialoGPT

Large-scale pretraining for dialogue
MIT License

How to choose the hyperparameters when pre-training #18

Open g-jing opened 4 years ago

g-jing commented 4 years ago

This is really great work. Could you share the hyperparameters used to pre-train DialoGPT, especially those for the GPT-small model?

intersun commented 4 years ago

Which parameter are you referring to?