TENER vs BERT

Yes. When pre-trained on a large variety of text, the vanilla Transformer can also develop a very good sense of direction and distance, and this has been verified by many papers that investigate the effectiveness of BERT. Our work discusses why the vanilla Transformer does not do well on the NER task when it lacks such pre-training. Based on that discussion we made some improvements, and luckily they worked. But this is by no means the only way to solve the problem, and BERT can definitely achieve better performance than TENER.

fastnlp / TENER

TENER vs BERT #17