dhlee347 / pytorchic-bert

Pytorch Implementation of Google BERT
Apache License 2.0
591 stars 179 forks

Pretrain for Chinese text #8

Closed Jason-kid closed 5 years ago

Jason-kid commented 5 years ago

Hi, I want to use this code to pretrain on Chinese data as the data file. The format is like this: 今天 天气 好. Can I use my own vocab.txt? Thanks a lot.

dhlee347 commented 5 years ago

Maybe, but I'm not sure about Chinese data. Good luck, and please share your success story if it works!
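
For readers with the same question, here is a minimal sketch of building a BERT-style vocab.txt from a corpus that is already whitespace-segmented, as in the example line above. It is not taken from this repository: `build_vocab` and `write_vocab` are hypothetical helper names, and placing the special tokens first is an assumed convention. The thread does not confirm whether the repository's tokenizer keeps multi-character words such as 今天 intact. Some BERT tokenizers split Chinese text into single characters, in which case a word-level vocab would not match, so check that before pretraining.

```python
from collections import Counter

# Special tokens used by BERT vocabularies; listing them first is an assumption.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_vocab(lines, min_count=1):
    """Hypothetical helper: collect whitespace-separated tokens into a vocab list."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    # Most frequent tokens first; ties are broken by the token itself
    # so that the output order is deterministic.
    tokens = sorted(
        (tok for tok, n in counts.items() if n >= min_count),
        key=lambda tok: (-counts[tok], tok),
    )
    return SPECIAL_TOKENS + tokens


def write_vocab(vocab, path):
    """Hypothetical helper: write one token per line, UTF-8 encoded."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(vocab) + "\n")


# Toy corpus in the format from the question (pre-segmented words).
corpus = ["今天 天气 好", "今天 好"]
vocab = build_vocab(corpus)
```

Calling `write_vocab(vocab, "vocab.txt")` would then produce a file with one token per line, which is the layout BERT vocab files use.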