hyperparameters for pre-training

sinovation / ZEN

A BERT-based Chinese Text Encoder Enhanced by N-gram Representations

Apache License 2.0

Open BinHeRunning opened 4 years ago

BinHeRunning commented 4 years ago

Hi, this is nice work!

Could you give some more details about the hyperparameters used in pre-training?

ZEN (P) is trained starting from Google's BERT. How many epochs were used in the additional pre-training?

Thanks!