benywon / ReCO

ReCO: A Large Scale Chinese Reading Comprehension Dataset on Opinion

How should the hyperparameters be set when training albert_xxlarge? #3

Open bestbzw opened 3 years ago

bestbzw commented 3 years ago

Hello, when I train xx_large on ReCO, the model's loss never goes down. How did you set your hyperparameters? Did you use strategies such as warmup or dropout?

benywon commented 3 years ago

Use the LAMB optimizer, and set the lr a bit lower.
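For readers unfamiliar with LAMB, here is a minimal pure-Python sketch of one LAMB update for a single layer. It is not the training code used for ReCO; the function names (`init_state`, `lamb_step`) and the default values (`lr=1e-4`, `weight_decay=0.01`, and so on) are assumptions for illustration only. It shows the layer-wise trust ratio that scales each step to the size of the layer's weights, which is the reason LAMB is often suggested for very large models.

```python
import math

def init_state(n):
    """Optimizer state for one layer with n parameters (hypothetical helper)."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}

def lamb_step(weights, grads, state, lr=1e-4, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01):
    """Return updated weights after one LAMB step for a single layer.

    All default values are illustrative assumptions, not the ReCO settings.
    """
    state["t"] += 1
    t = state["t"]
    update = []
    for i, (w, g) in enumerate(zip(weights, grads)):
        # Adam-style first and second moment estimates with bias correction.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # Adam direction plus decoupled weight decay.
        update.append(m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    # Layer-wise trust ratio: ||w|| / ||update||, falling back to 1.0
    # when either norm is zero.
    w_norm = math.sqrt(sum(w * w for w in weights))
    u_norm = math.sqrt(sum(u * u for u in update))
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    return [w - lr * trust * u for w, u in zip(weights, update)]

# Example: one step on a toy two-parameter layer.
state = init_state(2)
new_weights = lamb_step([3.0, 4.0], [1.0, -1.0], state, lr=1e-4)
```

Because of the trust ratio, the length of each step is roughly `lr` times the norm of the layer's weights, so lowering `lr` directly shrinks the relative change per step. In an actual training run, a tested library implementation of LAMB would be preferable to this sketch.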