fastnlp / TENER

Codes for "TENER: Adapting Transformer Encoder for Named Entity Recognition"

TENER vs BERT #17

Closed tangzhy closed 4 years ago

tangzhy commented 4 years ago

Hi, have you compared the adapted Transformer with BERT, where pre-trained knowledge might make up for the drawbacks of the vanilla Transformer?

yhcc commented 4 years ago

Yes, when pre-trained on a large amount of text, the vanilla Transformer can also develop a good sense of direction and distance, and this has been verified by many papers investigating the effectiveness of BERT. Our work aims to discuss why the vanilla Transformer does not perform well on the NER task; based on that discussion we make some improvements, and fortunately they worked. But this is by no means the only way to solve the problem, and BERT can definitely achieve better performance than TENER.
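The "sense of direction" point above can be checked numerically: with the standard sinusoidal position embeddings of the vanilla Transformer, the dot product between positions `t` and `t+k` is exactly the same as between `t` and `t-k`, so attention based on these dot products sees distance but not direction. A minimal sketch (illustrative only, not code from this repo):

```python
import numpy as np

def sinusoidal(pos, d=128):
    # Standard Transformer sinusoidal position embedding:
    # even dims get sin(pos * freq_i), odd dims get cos(pos * freq_i).
    i = np.arange(d // 2)
    freq = 1.0 / (10000 ** (2 * i / d))
    emb = np.zeros(d)
    emb[0::2] = np.sin(pos * freq)
    emb[1::2] = np.cos(pos * freq)
    return emb

t, k = 20, 5
fwd = sinusoidal(t) @ sinusoidal(t + k)  # looking k tokens to the right
bwd = sinusoidal(t) @ sinusoidal(t - k)  # looking k tokens to the left
print(np.isclose(fwd, bwd))  # True: the dot product is symmetric in k
```

Each dot product reduces to a sum of cos(k * freq_i) terms, which depends only on |k|; TENER's relative, direction-aware attention is one way to restore the missing sign information, while BERT instead learns it implicitly from large-scale pre-training.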