codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation
Apache License 2.0
6.19k stars 1.3k forks

Making Wikipedia Corpus #42

Open codertimo opened 5 years ago

codertimo commented 5 years ago

The goal is to build the same corpus that the original paper used. Please share your tips for downloading and preprocessing the files. It would also be great if someone could share the preprocessed data via Dropbox, Google Drive, or a similar service.

codertimo commented 5 years ago

32