lxylxyoo opened 5 years ago
I have been wondering about the same thing. I am not able to find any further details about how these methods can be used along with BERT. Any details on this would be very helpful.
same question
Here is the answer, from page 26
Thank you so much for sharing!
So what about N-Gram Masking? Does anyone know? Even the single model beats the others lol
Baidu recently open-sourced a model called ERNIE, which masks n-grams instead of single tokens (1-grams) during pre-training. I think this is similar to BERT + N-Gram Masking. https://github.com/PaddlePaddle/LARK/tree/develop/ERNIE
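To make the idea concrete, here is a minimal Python sketch of the difference (an illustration only, not ERNIE's actual code): instead of replacing independently sampled single tokens with `[MASK]`, contiguous spans of up to `max_n` tokens are masked together. The function name `ngram_mask`, the 15% mask budget, and the uniform span-length sampling are all my assumptions for illustration.

```python
import random

MASK = "[MASK]"

def ngram_mask(tokens, mask_rate=0.15, max_n=3, seed=0):
    """Illustration only; not ERNIE's actual implementation.
    Mask contiguous spans of 1..max_n tokens until roughly mask_rate
    of the sequence is covered, instead of masking independent single
    tokens as in the original BERT objective."""
    rng = random.Random(seed)
    tokens = list(tokens)
    budget = max(1, int(len(tokens) * mask_rate))
    masked = set()
    attempts = 0
    while budget > 0 and attempts < 10 * len(tokens):
        attempts += 1
        n = rng.randint(1, max_n)                        # span length: the "n" in n-gram
        start = rng.randrange(0, max(1, len(tokens) - n + 1))
        span = range(start, min(start + n, len(tokens)))
        if any(i in masked for i in span):               # keep spans non-overlapping
            continue
        for i in span:
            tokens[i] = MASK
            masked.add(i)
        budget -= len(span)
    return tokens

sentence = "bert randomly masks single tokens but ernie masks whole multi token spans during pre training"
print(ngram_mask(sentence.split(), mask_rate=0.3))
```

The intuition is that when a whole phrase or entity is masked, the model cannot recover any single token from its immediate neighbors inside the span, so it is forced to use longer-range context.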
Is synthetic self-training in this repo?
What is N-Gram Masking? Anyone care to translate the readme in Baidu's repo to English?
I noticed that the best model for SQuAD 2.0 uses BERT + a synthetic self-training trick, but I can't find any description of it. Could anyone explain how this trick works? Thanks!