Closed: LinMu7177 closed this issue 4 years ago.
Hi, I have not tried it. I just think that revising the internal embedding layer might disturb the parameter structure of the pre-trained BERT. Anyway, it would be a worthwhile experiment when there is time.
Okay, thank you for your reply!!!
First of all, thank you for open-sourcing the code! It isn't clear to me why the SRL information is not injected as additional embedding layers, but is instead aggregated separately. Have you tried that before? I hope you can share your opinion, thank you very much!
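For concreteness, here is a minimal sketch, not the repository's actual code, contrasting the two designs discussed in this thread: (A) adding an SRL embedding directly to BERT's input embeddings, versus (B) keeping the pre-trained encoder untouched and aggregating SRL features afterwards. All names (`SRL_VOCAB`, `HIDDEN`, the module classes) are hypothetical placeholders.

```python
# Hypothetical sketch; module names and sizes are illustrative only.
import torch
import torch.nn as nn

SRL_VOCAB = 30   # number of SRL labels (assumed)
HIDDEN = 768     # BERT hidden size

class SrlAsInputEmbedding(nn.Module):
    """Option A: add an SRL embedding on top of BERT's input embeddings.
    The pre-trained layers now see inputs from a shifted distribution,
    which is the maintainer's concern about the parameter structure."""
    def __init__(self, bert_embeddings: nn.Module):
        super().__init__()
        self.bert_embeddings = bert_embeddings                   # pre-trained
        self.srl_embeddings = nn.Embedding(SRL_VOCAB, HIDDEN)    # new, randomly initialized

    def forward(self, input_ids, srl_ids):
        return self.bert_embeddings(input_ids) + self.srl_embeddings(srl_ids)

class SrlAggregatedSeparately(nn.Module):
    """Option B: leave BERT intact and fuse SRL features with the
    encoder output, so the pre-trained parameters are not disturbed."""
    def __init__(self, bert_encoder: nn.Module):
        super().__init__()
        self.bert_encoder = bert_encoder                         # pre-trained, unchanged
        self.srl_embeddings = nn.Embedding(SRL_VOCAB, HIDDEN)
        self.fuse = nn.Linear(2 * HIDDEN, HIDDEN)                # concatenate, then project

    def forward(self, input_ids, srl_ids):
        hidden = self.bert_encoder(input_ids)                    # (batch, seq, HIDDEN)
        srl = self.srl_embeddings(srl_ids)                       # (batch, seq, HIDDEN)
        return self.fuse(torch.cat([hidden, srl], dim=-1))
```

In this sketch, `bert_embeddings` and `bert_encoder` stand for any modules that map token ids to hidden states; the point is only the placement of the SRL signal, inside the input embedding layer versus outside the pre-trained stack.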