huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

is right? #2858

Closed ARDUJS closed 4 years ago

ARDUJS commented 4 years ago

In https://github.com/huggingface/transformers/blob/master/examples/run_language_modeling.py, at line 225:


The comment says "10% of the time, we replace masked input tokens with random word", but the code uses 0.5. Is that correct?

julien-c commented 4 years ago

Hi @ARDUJS can you update your issue title to something more descriptive? Thanks!

stefan-it commented 4 years ago

This should be correct. 80% of the selected tokens are masked, which leaves 20%. Within that 20%, a random word is used half of the time and the original token is kept the other half. So random replacement and keeping the original each have an overall probability of 10%.

The original BERT implementation uses the same logic, see here.
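
To make the arithmetic concrete, here is a minimal sketch of the two nested draws described above. It is not the code from run_language_modeling.py: it uses only Python's standard library, and the names `choose_action`, `simulate`, and `exact_probs` are hypothetical helpers made up for illustration. The 0.8 and 0.5 values are the ones discussed in this thread.

```python
import random
from collections import Counter

MASK_PROB = 0.8    # share of selected tokens replaced by the mask token
RANDOM_PROB = 0.5  # applied only to the tokens NOT replaced by the mask token


def choose_action(rng):
    """Decide what happens to one token already selected for masking."""
    if rng.random() < MASK_PROB:
        return "mask"    # 80% overall
    if rng.random() < RANDOM_PROB:
        return "random"  # 0.5 of the remaining 20%
    return "keep"        # the other 0.5 of the remaining 20%


def exact_probs():
    """Overall probability of each action, computed directly."""
    rest = 1.0 - MASK_PROB
    return {
        "mask": MASK_PROB,
        "random": rest * RANDOM_PROB,
        "keep": rest * (1.0 - RANDOM_PROB),
    }


def simulate(n=200_000, seed=0):
    """Empirical frequency of each action over n selected tokens."""
    rng = random.Random(seed)
    counts = Counter(choose_action(rng) for _ in range(n))
    return {name: counts[name] / n for name in ("mask", "random", "keep")}


if __name__ == "__main__":
    print(exact_probs())
    print(simulate())
```

Both the direct computation and the simulated frequencies should come out at roughly 0.8, 0.1, and 0.1, which is why a 0.5 in the code is consistent with a comment that says "10% of the time".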