lxylxyoo opened 5 years ago
I have been wondering about the same thing. I am not able to find any further details about how these methods can be used along with BERT. Any details on this would be very helpful.
same question
Here is the answer, from page 26
Thank you so much for sharing!
So what about N-Gram Masking? Does anyone know? Even the single model beats the others lol
Baidu recently open-sourced a model called ERNIE, which masks n-grams instead of single tokens (1-grams) during pre-training. I think this is similar to BERT + N-Gram Masking. https://github.com/PaddlePaddle/LARK/tree/develop/ERNIE
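To make the idea concrete, here is a minimal Python sketch of the difference (an illustration only, not ERNIE's actual code): instead of replacing independently sampled single tokens with `[MASK]`, contiguous spans of up to `max_n` tokens are masked together. The function name `ngram_mask`, the 15% mask budget, and the uniform span-length sampling are all my assumptions for illustration.

```python
import random

MASK = "[MASK]"

def ngram_mask(tokens, mask_rate=0.15, max_n=3, seed=0):
    """Illustration only; not ERNIE's actual implementation.
    Mask contiguous spans of 1..max_n tokens until roughly mask_rate
    of the sequence is covered, instead of masking independent single
    tokens as in the original BERT objective."""
    rng = random.Random(seed)
    tokens = list(tokens)
    budget = max(1, int(len(tokens) * mask_rate))
    masked = set()
    attempts = 0
    while budget > 0 and attempts < 10 * len(tokens):
        attempts += 1
        n = rng.randint(1, max_n)                        # span length: the "n" in n-gram
        start = rng.randrange(0, max(1, len(tokens) - n + 1))
        span = range(start, min(start + n, len(tokens)))
        if any(i in masked for i in span):               # keep spans non-overlapping
            continue
        for i in span:
            tokens[i] = MASK
            masked.add(i)
        budget -= len(span)
    return tokens

sentence = "bert randomly masks single tokens but ernie masks whole multi token spans during pre training"
print(ngram_mask(sentence.split(), mask_rate=0.3))
```

The intuition is that when a whole phrase or entity is masked, the model cannot recover any single token from its immediate neighbors inside the span, so it is forced to use longer-range context.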
Is synthetic self-training in this repo?
What is N-Gram Masking? Anyone care to translate the readme in Baidu's repo to English?
I noticed that the best model for SQuAD 2.0 uses BERT + a synthetic self-training trick, but I can't find any description of it. Could anyone explain how this trick works? Thanks!