Lynten / stanford-corenlp

Python wrapper for Stanford CoreNLP.
MIT License

NER for German words #107

Open lindeer opened 1 year ago

lindeer commented 1 year ago

I have a large text about history that contains many names and historical terms. CoreNLP recognizes the single words Wilhelm and Holland, but it does not recognize Wilhelm von Holland as a single entity. How can I make it do that, or is there something wrong with my usage?

jars: jars/stanford-corenlp-4.5.3.jar jars/stanford-corenlp-4.5.4-models-german.jar

command line: java -mx3g -cp "jars/*" edu.stanford.nlp.pipeline.StanfordCoreNLP -props StanfordCoreNLP-german.properties -annotators tokenize,ner -file section1.md -outputFormat conll -output.columns word,ner

output:

13  Wilhelms    _   _   PERSON  _   _
14  von _   _   O   _   _
15  Holland _   _   LOCATION    _   _
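
One possible workaround, sketched here as plain post-processing rather than anything CoreNLP provides, is to merge a PERSON token with a following name particle and entity token after tagging. The `merge_names` function and the `PARTICLES` set are hypothetical names introduced for illustration, and the rows are copied from the output above; the heuristic will also merge cases where the particle does not actually belong to a name.

```python
# Rows copied from the output above: (index, word, ner_tag).
ROWS = [
    (13, "Wilhelms", "PERSON"),
    (14, "von", "O"),
    (15, "Holland", "LOCATION"),
]

# Hypothetical list of name particles (an assumption); extend as needed.
PARTICLES = {"von", "van", "zu", "de"}


def merge_names(rows, particles=PARTICLES):
    """Join PERSON + particle + entity runs into a single PERSON span."""
    spans = []
    i = 0
    while i < len(rows):
        _, word, tag = rows[i]
        words = [word]
        j = i + 1
        if tag == "PERSON":
            # Absorb "<particle> <entity>" pairs that directly follow the name.
            while (
                j + 1 < len(rows)
                and rows[j][1].lower() in particles
                and rows[j][2] == "O"
                and rows[j + 1][2] != "O"
            ):
                words += [rows[j][1], rows[j + 1][1]]
                j += 2
        spans.append((" ".join(words), tag))
        i = j
    return spans


print(merge_names(ROWS))
```

Tokens that do not start with a PERSON tag pass through unchanged, so a particle between two untagged words is left alone.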