CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
GNU General Public License v3.0
I use this command, but the tokenization results are the same as splitting on spaces. Thank you very much. #1449
Open
guotong1988 opened 5 months ago
java -cp "stanford-corenlp-4.5.6/*" edu.stanford.nlp.international.arabic.process.ArabicTokenizer useUTF8Ellipsis,normArDigits,normArPunc,normAlif,normYa,removeDiacritics,removeTatweel,removeQuranChars,removeProMarker,removeSegMarker,removeMorphMarker,removeLengthening,atbEscaping < input.txt
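For reference, the whitespace-splitting baseline that the tokenizer output is being compared against can be sketched as below. This is only a minimal sketch using the Java standard library: the class name `WhitespaceBaseline` and its `split` method are hypothetical and not part of CoreNLP, and it assumes `input.txt` is UTF-8 encoded.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of CoreNLP): a plain whitespace-split baseline.
public class WhitespaceBaseline {

    // Split one line on runs of whitespace, dropping empty tokens.
    // Punctuation stays attached to the neighbouring word.
    public static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        for (String tok : line.trim().split("\\s+")) {
            if (!tok.isEmpty()) {
                tokens.add(tok);
            }
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Read and write UTF-8 explicitly so Arabic text is not garbled
        // by the platform default encoding.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        String line;
        while ((line = in.readLine()) != null) {
            out.println(String.join(" ", split(line)));
        }
    }
}
```

Running `java WhitespaceBaseline < input.txt > baseline.txt` and comparing `baseline.txt` with the tokenizer output using `diff` would show exactly which lines, if any, differ between the two.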
Thank you very much @AngledLuffa @J38 @gangeli @rayder441 @angelxuanchang