dmlc / gluon-nlp

NLP made easy
https://nlp.gluon.ai/
Apache License 2.0
2.55k stars, 538 forks

[WIP] add character-level tokenizer #1515

Open zheyuye opened 3 years ago

zheyuye commented 3 years ago

Description

Add a character-level tokenizer, along with tests.
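For reviewers unfamiliar with the idea, here is a minimal sketch of what a character-level tokenizer does. It is an illustration only, not the code in this PR: the class name `CharTokenizer`, its methods, and the `<unk>` token are all hypothetical and do not reflect the gluon-nlp API.

```python
from typing import Dict, List


class CharTokenizer:
    """Hypothetical character-level tokenizer (not the class added by this PR)."""

    def __init__(self, unk_token: str = "<unk>") -> None:
        # Index 0 is reserved for characters never seen during fit().
        self.unk_token = unk_token
        self.token_to_id: Dict[str, int] = {unk_token: 0}
        self.id_to_token: List[str] = [unk_token]

    def fit(self, corpus: List[str]) -> None:
        # Assign ids to characters in order of first appearance.
        for text in corpus:
            for ch in text:
                if ch not in self.token_to_id:
                    self.token_to_id[ch] = len(self.id_to_token)
                    self.id_to_token.append(ch)

    def encode(self, text: str) -> List[int]:
        # Each character maps to one id; unknown characters map to unk.
        unk_id = self.token_to_id[self.unk_token]
        return [self.token_to_id.get(ch, unk_id) for ch in text]

    def decode(self, ids: List[int]) -> str:
        # Concatenate the tokens back into a string.
        return "".join(self.id_to_token[i] for i in ids)


tok = CharTokenizer()
tok.fit(["hello"])
print(tok.encode("hole"))  # [1, 4, 3, 2]
print(tok.encode("hey"))   # [1, 2, 0]  ('y' was not seen in fit)
```

Whether unknown characters should map to a dedicated token, as assumed here, or raise an error is one of the API design questions worth settling in review.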

Comments

Although this PR is a work in progress, it is open for review. Please leave comments on code quality and API design.

DO NOT merge until it is ready.

cc @sxjscience