stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0
9.71k stars, 2.7k forks

I use this command, but the word segmentation results are the same as splitting on spaces. Thank you very much. #1449

Open guotong1988 opened 5 months ago

guotong1988 commented 5 months ago

java -cp "stanford-corenlp-4.5.6/*" edu.stanford.nlp.international.arabic.process.ArabicTokenizer useUTF8Ellipsis,normArDigits,normArPunc,normAlif,normYa,removeDiacritics,removeTatweel,removeQuranChars,removeProMarker,removeSegMarker,removeMorphMarker,removeLengthening,atbEscaping < input.txt

Thank you very much @AngledLuffa @J38 @gangeli @rayder441 @angelxuanchang