brightmart / albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS, large-scale Chinese pretrained ALBERT models
https://arxiv.org/pdf/1909.11942.pdf

The xxlarge model won't train during fine-tuning #152

Open bestbzw opened 4 years ago

bestbzw commented 4 years ago

With the same code, both BERT and RoBERTa train fine, but with albert_xxlarge the loss does not decrease. Are there hyperparameters that need to be set for training? I load the model with AutoModel.from_pretrained and the tokenizer with BertTokenizer.from_pretrained.
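One common adjustment when a very large model's loss plateaus while smaller variants train fine is to lower the peak learning rate and use warmup. This is a minimal, framework-free sketch of a linear warmup-then-decay schedule; the function name, the peak LR of 1e-5, and the warmup ratio are illustrative assumptions, not values confirmed by the repo.

```python
# Hypothetical helper: linear warmup to a peak learning rate, then linear
# decay to zero. Large ALBERT variants (xlarge/xxlarge) are often more
# sensitive to the learning rate than BERT-base-sized models, so a smaller
# peak and a gentle warmup is a typical first thing to try.

def lr_at_step(step, total_steps, peak_lr=1e-5, warmup_ratio=0.1):
    """Return the learning rate for a given optimizer step."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    # Decay linearly from peak_lr down to 0 over the remaining steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

In a training loop this value would be assigned to the optimizer's LR each step (e.g. via a LambdaLR-style scheduler in PyTorch); the point is only that xxlarge often needs a noticeably smaller peak LR than the one that worked for BERT/RoBERTa.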