brightmart / bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understanding: pre-train TextCNN

Question about the pre-trained word embedding #5

Closed ethereal666 closed 6 years ago

ethereal666 commented 6 years ago

If Tencent_AILab_ChineseEmbedding_100w.txt is used as the pre-trained word embedding, shouldn't tf.stop_gradient be applied after tf.assign? Otherwise the embedding gets updated during training — or is the pre-trained embedding only meant to serve as a better initialization?
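A minimal sketch of the two options being asked about, assuming a TF 1.x graph like the one in this repo; the sizes and the random matrix standing in for the vectors parsed from Tencent_AILab_ChineseEmbedding_100w.txt are placeholders:

```python
import numpy as np
import tensorflow as tf  # TF 1.x style, as used in this repo

vocab_size, embed_size = 100000, 200
# placeholder for the matrix you would parse from Tencent_AILab_ChineseEmbedding_100w.txt
pretrained = np.random.randn(vocab_size, embed_size).astype(np.float32)

embedding = tf.get_variable("embedding", shape=[vocab_size, embed_size])
# copy the pre-trained vectors into the variable once, before training starts
init_embedding_op = tf.assign(embedding, pretrained)

token_ids = tf.placeholder(tf.int32, [None, None])

# Option A: freeze the embedding, so it only acts as a fixed initialization
embedded_frozen = tf.nn.embedding_lookup(tf.stop_gradient(embedding), token_ids)

# Option B: keep the embedding trainable and let it adapt to the task
embedded_trainable = tf.nn.embedding_lookup(embedding, token_ids)
```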

brightmart commented 6 years ago

Generally speaking, people load a pre-trained word embedding and continue training it on the domain-related task. The reason is that we want a general representation of words, but we also want that representation to fit the domain-related task well.

We apply a small learning rate during the fine-tuning stage to make sure the adjustment is small and effective.
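One way to express that is to give the embedding its own, smaller learning rate while the rest of the model trains normally. The sketch below assumes a TF 1.x setup; the mean-pooled toy classifier and the 1e-5 / 1e-3 learning rates are illustrative placeholders, not values taken from this repo:

```python
import tensorflow as tf  # TF 1.x style

vocab_size, embed_size, num_classes = 100000, 200, 10
token_ids = tf.placeholder(tf.int32, [None, None])
labels = tf.placeholder(tf.int32, [None])

embedding = tf.get_variable("embedding", [vocab_size, embed_size])
# toy classifier standing in for the real model: mean-pool embeddings, then a dense layer
pooled = tf.reduce_mean(tf.nn.embedding_lookup(embedding, token_ids), axis=1)
logits = tf.layers.dense(pooled, num_classes)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# small learning rate for the pre-trained embedding, a larger one for everything else
embed_vars = [embedding]
other_vars = [v for v in tf.trainable_variables() if v is not embedding]
train_embed = tf.train.AdamOptimizer(1e-5).minimize(loss, var_list=embed_vars)
train_other = tf.train.AdamOptimizer(1e-3).minimize(loss, var_list=other_vars)
train_op = tf.group(train_embed, train_other)
```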