Closed by at15 8 years ago
HanLP already provides a tokenization module, but its keyword extractor uses the standard tokenizer.
Storing the index in a single file works now; time to split it and query against it.
Hmm, have to say: using JSON makes the index file really big, 3 MB -> 70 MB.
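For what it's worth, here is a rough sketch of how splitting the index and querying it could look, while also trimming some of the JSON overhead. It assumes the index is a plain mapping from term to a list of document ids, which may not match the real layout. The names `shard_of`, `write_shards`, `query` and the `shard-N.json` file naming are all made up for illustration.

```python
import json
import os
import zlib


def shard_of(term: str, num_shards: int) -> int:
    # crc32 is stable across runs, unlike the built-in hash() on strings
    return zlib.crc32(term.encode("utf-8")) % num_shards


def write_shards(index: dict, out_dir: str, num_shards: int = 4) -> None:
    # Assumed layout: index maps term -> list of document ids (hypothetical)
    shards = [{} for _ in range(num_shards)]
    for term, postings in index.items():
        shards[shard_of(term, num_shards)][term] = postings
    for i, shard in enumerate(shards):
        path = os.path.join(out_dir, f"shard-{i}.json")
        with open(path, "w", encoding="utf-8") as f:
            # Compact separators and raw UTF-8 (no \uXXXX escapes) give a
            # smaller file than the json.dump defaults
            json.dump(shard, f, ensure_ascii=False, separators=(",", ":"))


def query(term: str, out_dir: str, num_shards: int = 4) -> list:
    # Only the one shard that can contain the term is read from disk
    path = os.path.join(out_dir, f"shard-{shard_of(term, num_shards)}.json")
    with open(path, encoding="utf-8") as f:
        return json.load(f).get(term, [])
```

A query only loads one shard instead of the whole file. The `ensure_ascii=False` setting matters for Chinese terms, since the default escapes each character as a six-byte `\uXXXX` sequence instead of three UTF-8 bytes. This will not get back to 3 MB on its own, but it should help.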
directly tokenize result