google-research / albert

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Apache License 2.0

Training without SOP Loss #202

Open divkakwani opened 4 years ago

divkakwani commented 4 years ago

Hi,

I'm trying to train ALBERT on a corpus of crawled news articles that has been filtered aggressively (many sentences may have been dropped), and the final corpus has no separation between articles. In this case, do you think removing the SOP loss would be a better idea? If so, how do I go about doing this with your code?
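
To make the question concrete, here is a minimal sketch of what I mean by "removing" the SOP loss. It assumes the pretraining objective is the sum of a masked-LM term and a sentence-order term. The helper name `combine_pretraining_losses` and the `sop_weight` parameter are hypothetical, used only for illustration; they are not taken from the repository.

```python
def combine_pretraining_losses(masked_lm_loss, sentence_order_loss, sop_weight=1.0):
    """Hypothetical helper: total loss = MLM loss + sop_weight * SOP loss.

    Assumes both inputs are scalar losses (plain floats here; tensors would
    combine the same way). Setting sop_weight to 0.0 drops the SOP term.
    """
    return masked_lm_loss + sop_weight * sentence_order_loss


# Illustrative values only, not real training numbers.
total_with_sop = combine_pretraining_losses(2.5, 0.5)
total_without_sop = combine_pretraining_losses(2.5, 0.5, sop_weight=0.0)
```

With `sop_weight=0.0`, only the masked-LM term would contribute to the total. Whether zeroing the term like this is enough, or whether the data pipeline also needs to stop producing sentence-order pairs, is part of what I'm asking.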