google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Does <s> represent whitespace in the Chinese pretrained vocabulary? #385

Open lorashen opened 5 years ago

lorashen commented 5 years ago

Because the Chinese pretrained vocab does not include all English words, I split English words into characters. How should I then represent the whitespace between English words?
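For reference, here is a minimal sketch of the character-level split described above. It is not code from the BERT repository. The function name `split_keeping_spaces` and the placeholder `[unused1]` are assumptions made only for illustration; which vocabulary entry, if any, should stand in for a space is exactly what this issue is asking.

```python
from typing import List

# Hypothetical placeholder for a space. Whether the pretrained vocab has an
# entry meant for this purpose is the open question in this issue.
SPACE_TOKEN = "[unused1]"


def split_keeping_spaces(text: str, space_token: str = SPACE_TOKEN) -> List[str]:
    """Split text into single characters, emitting one placeholder
    token for each run of whitespace between characters."""
    tokens: List[str] = []
    previous_was_space = False
    for ch in text.strip():
        if ch.isspace():
            # Collapse consecutive whitespace into a single placeholder.
            if not previous_was_space:
                tokens.append(space_token)
            previous_was_space = True
        else:
            tokens.append(ch)
            previous_was_space = False
    return tokens


print(split_keeping_spaces("我爱 new york"))
# ['我', '爱', '[unused1]', 'n', 'e', 'w', '[unused1]', 'y', 'o', 'r', 'k']
```

Without some such marker, "new york" and "newyork" would produce the same character sequence, which is why the whitespace token matters here.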

lan2720 commented 4 years ago

+1. And does mean tab?