Closed · z744364418p closed this 6 years ago
I changed tokenize() to split(), but it doesn't work. Can you help me? Thank you.

How do I change the alphabet for Chinese?
You should remove or change this line in clean_str(), because it strips out all Chinese characters. To use the character-level models (char_cnn, vd_cnn) you also need to define an alphabet. I don't know much about Chinese, so I'm not sure how to define an alphabet for it, but koalaGreener/Character-level-Convolutional-Network-for-Text-Classification-Applied-to-Chinese-Corpus might help you.
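A minimal sketch of what I mean, assuming clean_str() uses the usual Yoon-Kim-style regex that whitelists only ASCII characters (I'm guessing at the exact line since it isn't quoted here). The CJK range `\u4e00-\u9fff` and the `build_alphabet()` helper are my own assumptions, not part of this repo:

```python
import re
from collections import Counter

def clean_str(string):
    # Assumed original line: re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string),
    # which deletes every Chinese character. Adding the CJK Unified
    # Ideographs range (\u4e00-\u9fff) to the whitelist keeps Chinese text.
    string = re.sub(r"[^A-Za-z0-9\u4e00-\u9fff(),!?\'\`]", " ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()

def build_alphabet(texts, max_size=5000):
    # Hypothetical helper: Chinese has far too many characters to list by
    # hand, so one option is to build the alphabet from the most frequent
    # characters in the training corpus and treat the rest as unknown.
    counts = Counter(ch for text in texts for ch in text if not ch.isspace())
    return "".join(ch for ch, _ in counts.most_common(max_size))
```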
In general, for a different language you need to change clean_str(), word_tokenize(), and the alphabet; a rough sketch of the tokenizer part follows.
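One way to replace word_tokenize() for Chinese is a segmenter such as jieba (a third-party library, not part of this repo, so treat this as a sketch); for the character-level models, simply splitting into individual characters is usually enough:

```python
import jieba  # pip install jieba; a common Chinese word segmenter

def word_tokenize(text):
    # Word-level models (e.g. word_cnn): segment with jieba instead of
    # splitting on whitespace, which does nothing useful for Chinese.
    return jieba.lcut(text)

def char_tokenize(text):
    # Character-level models (char_cnn, vd_cnn): each character is a token.
    return [ch for ch in text if not ch.isspace()]
```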