Open datancoffee opened 6 years ago
Hi, did you try this with the latest models jars linked from the GitHub front page?
https://github.com/stanfordnlp/CoreNLP
When I look at the most current models jars we have out, they have the new file paths for the regex rules files.
Make sure not to use the 3.9.1 jars if you are using the code from GitHub; those are now out of date for the latest code. We are going to release 3.9.2 fairly soon!
When building the jar from GitHub HEAD following the instructions at https://stanfordnlp.github.io/CoreNLP/download.html, the resulting code fails to load the NER models because of the extra "gazetteers" path segment in:
public static final String DEFAULT_KBP_REGEXNER_CASELESS = "edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab";
Steps:
ant jar
export CLASSPATH="$CLASSPATH:/pathto/corenlp/javanlp-core.jar:/pathto/corenlp/stanford-corenlp-3.9.1-models.jar:/pathto/corenlp/stanford-corenlp-3.9.1-models-english.jar:/pathto/corenlp/stanford-corenlp-3.9.1-models-english-kbp.jar";
java -mx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -outputFormat json -file input.txt
(Actually, I had to increase the memory from 3g to 5g, since 3g is not enough; you might want to change these instructions as well.)
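One way to confirm whether the models jars on the classpath match the code is to ask the JVM whether it can find the resource path the code expects. The following is a minimal sketch, not part of CoreNLP: the class name `ResourceCheck` and its `locate` helper are hypothetical, and the default path is simply the value of `DEFAULT_KBP_REGEXNER_CASELESS` quoted above. It relies only on the standard `ClassLoader.getSystemResource` call, which returns `null` when no classpath entry contains the path.

```java
import java.net.URL;

public class ResourceCheck {

    // Hypothetical helper: returns the URL of the resource if some classpath
    // entry (jar or directory) contains it, or null if none does.
    public static URL locate(String resourcePath) {
        return ClassLoader.getSystemResource(resourcePath);
    }

    public static void main(String[] args) {
        // Default to the path quoted in this issue; pass another path as the
        // first argument to check a different resource.
        String path = args.length > 0
                ? args[0]
                : "edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab";

        URL url = locate(path);
        if (url == null) {
            System.out.println("NOT FOUND on classpath: " + path);
        } else {
            // The URL shows which jar actually supplies the file.
            System.out.println("Found: " + url);
        }
    }
}
```

Compiled and run with the same CLASSPATH as in the steps above, this should report the path as not found if the models jars predate the new file layout, and print the supplying jar if they match.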