codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0

Why output_label=0 in datasets generation for Masked LM #36

Closed: leon-cas closed this issue 5 years ago

leon-cas commented 5 years ago

In dataset.py, function 'random_word', line 90: why is the output_label for the 85% of the data that is not masked set to 0, i.e. output_label.append(0)?

codertimo commented 5 years ago

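@leon-cas Because tokens that aren't masked shouldn't be trained on by the optimizer, and the 85% rate comes from the paper (it is the complement of the 15% masking rate). Label 0 is masked out of the loss, so those positions get no gradient through backpropagation.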
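
A minimal sketch of the idea in this reply, written in plain Python and not taken from the repository: positions whose label is 0 are skipped when the loss is averaged, so they contribute nothing to the gradient. The function name `masked_nll` and the toy numbers are hypothetical. In PyTorch the same effect is commonly obtained with the `ignore_index` argument of `torch.nn.NLLLoss`, though this thread does not show which loss the repository actually uses.

```python
import math

IGNORE_LABEL = 0  # label the thread uses for positions that were not masked


def masked_nll(log_probs, labels, ignore_label=IGNORE_LABEL):
    """Average negative log-likelihood over positions whose label is not ignore_label.

    log_probs: one list of log-probabilities (over the vocabulary) per position
    labels:    one target token id per position; ignore_label means "no target"
    """
    total, count = 0.0, 0
    for position_log_probs, label in zip(log_probs, labels):
        if label == ignore_label:
            continue  # unmasked position: excluded from the loss entirely
        total += -position_log_probs[label]
        count += 1
    return total / count if count else 0.0


# Toy example: 4 positions, vocabulary of 3 ids; only position 2 was masked.
uniform = [math.log(1 / 3)] * 3
log_probs = [uniform, uniform, [math.log(0.1), math.log(0.2), math.log(0.7)], uniform]
labels = [0, 0, 2, 0]
loss = masked_nll(log_probs, labels)  # only position 2 contributes: -log(0.7)
```

This only works if id 0 is never a legitimate prediction target, for example when it is reserved for padding.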

leon-cas commented 5 years ago

So you mean only 15% of all the data is used to train the MLM?

codertimo commented 5 years ago

Yes, and it's noted in the paper too.

jiqiujia commented 5 years ago

It means each word in a sentence is masked out with 15% probability and MLM is trained to predict the masked words. Please read the paper carefully.
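
A rough sketch of the labeling described here, assuming a simplified scheme in which every selected token is replaced by a mask id. It is not the repository's `random_word`; the names `mask_tokens` and `mask_id` and the id values are hypothetical. The paper additionally replaces some selected tokens with a random token or leaves them unchanged, which this sketch omits.

```python
import random

MASK_PROB = 0.15  # masking probability mentioned in the thread


def mask_tokens(token_ids, mask_id, rng, mask_prob=MASK_PROB):
    """Return (inputs, labels) for masked-LM training.

    Each token is selected independently with probability mask_prob.
    Selected tokens are replaced by mask_id and keep their original id as label;
    all other tokens are left as-is and get label 0 (no prediction target).
    """
    inputs, labels = [], []
    for token_id in token_ids:
        if rng.random() < mask_prob:
            inputs.append(mask_id)
            labels.append(token_id)  # the model must predict the original token
        else:
            inputs.append(token_id)
            labels.append(0)  # ignored by the loss
    return inputs, labels


# Usage: token ids start at 5 so that none collides with label 0 or the mask id.
rng = random.Random(0)
tokens = list(range(5, 1005))
inputs, labels = mask_tokens(tokens, mask_id=4, rng=rng)
masked_fraction = sum(1 for label in labels if label != 0) / len(labels)
```

With enough tokens, `masked_fraction` should land near 0.15, and the remaining roughly 85% of positions carry label 0.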

leon-cas commented 5 years ago

@jiqiujia @codertimo thanks, guys.