I wonder how your ERNIE performs compared with BERT + N-gram masking. The BERT model released by Google does not include this training procedure, which has been shown to be quite useful on the SQuAD dataset.
Yes, this strategy is very useful for some tasks, and Google has updated their repo with a new pre-trained model that uses it. You can fine-tune that BERT model on the dataset.
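For anyone curious what N-gram (span) masking looks like in practice, here is a minimal sketch. It is not the actual code from either repo; the function names and the span-sampling heuristic are illustrative assumptions. The idea is that instead of masking single tokens independently (as in the original BERT), you mask spans of consecutive tokens so the model cannot recover a masked word from its immediate neighbors:

```python
import random

def ngram_mask(tokens, spans, mask_token="[MASK]"):
    """Replace each (start, n) span of n consecutive tokens with mask_token.
    This is the core of N-gram masking: whole spans are masked together."""
    out = list(tokens)
    for start, n in spans:
        for j in range(start, min(start + n, len(out))):
            out[j] = mask_token
    return out

def sample_spans(num_tokens, mask_prob=0.15, max_n=3, seed=0):
    """Toy span sampler (an assumption, not the released recipe):
    walk the sequence and occasionally start a span of 1..max_n tokens."""
    rng = random.Random(seed)
    spans, i = [], 0
    while i < num_tokens:
        if rng.random() < mask_prob:
            n = rng.randint(1, max_n)
            spans.append((i, n))
            i += n
        else:
            i += 1
    return spans

# Deterministic example with hand-picked spans:
toks = "the quick brown fox jumps over".split()
print(ngram_mask(toks, [(1, 2), (4, 1)]))
# → ['the', '[MASK]', '[MASK]', 'fox', '[MASK]', 'over']
```

With single-token masking the model might infer "quick" from "brown" alone; masking the whole 2-gram forces it to rely on longer-range context, which is the intuition behind the SQuAD gains mentioned above.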