google / sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.
Apache License 2.0
10.22k stars 1.17k forks

How to change the new-word character? #198

Closed desh2608 closed 6 years ago

desh2608 commented 6 years ago

Hi, I am using the Python pip package and have a small question. Can we use a different character, say '|', instead of '_' for denoting the start of a new word? If yes, how?

taku910 commented 6 years ago

No, the special symbol for the space is hard-coded.
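
Since the marker cannot be configured, one possible workaround is to swap it after encoding, purely as a post-processing step on the output pieces. The sketch below is not part of SentencePiece: `swap_marker`, `restore_marker`, and `detokenize` are hypothetical helper names, and the piece list is hand-written for illustration rather than produced by a real model. It assumes the marker is the character "▁" (U+2581), which looks like an underscore but is a different code point, and that the replacement character does not otherwise occur in the text.

```python
# "▁" (U+2581) is the character SentencePiece uses to mark a space.
# It is not the ASCII underscore "_".
SP_SPACE = "\u2581"


def swap_marker(pieces, new_marker="|"):
    """Return pieces with the space marker replaced, e.g. for display."""
    return [p.replace(SP_SPACE, new_marker) for p in pieces]


def restore_marker(pieces, new_marker="|"):
    """Undo swap_marker. Only safe if new_marker never occurs in the raw text."""
    return [p.replace(new_marker, SP_SPACE) for p in pieces]


def detokenize(pieces):
    """Join pieces and turn the space marker back into real spaces."""
    return "".join(pieces).replace(SP_SPACE, " ").lstrip(" ")


# Hand-written example pieces (an assumption, not real model output).
# With the pip package they would typically come from something like
# SentencePieceProcessor.EncodeAsPieces(text).
pieces = ["\u2581Hello", "\u2581wor", "ld"]

shown = swap_marker(pieces)        # pieces with "|" as the new-word marker
restored = restore_marker(shown)   # back to the original marker
text = detokenize(restored)        # plain text again
```

The swapped pieces are only suitable for display or export. Anything passed back to the model for decoding should use the original marker, which is why the sketch restores it before detokenizing. If "|" can appear in the input text, the round trip becomes ambiguous and a different replacement character would be needed.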