stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

sentiment annotator: java.lang.NoClassDefFoundError: org/ejml/simple/SimpleBase #347

Closed loretoparisi closed 7 years ago

loretoparisi commented 7 years ago

I have added the sentiment annotator among the others:

annotators="tokenize,ssplit,pos,lemma,ner,sentiment";

I then get an exception

Adding annotator sentiment
{ Error: Error creating class
java.lang.NoClassDefFoundError: org/ejml/simple/SimpleBase
    at edu.stanford.nlp.pipeline.SentimentAnnotator.<init>(SentimentAnnotator.java:52)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.sentiment(AnnotatorImplementations.java:264)
    at edu.stanford.nlp.pipeline.AnnotatorFactories$16.create(AnnotatorFactories.java:446)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:152)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:451)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:154)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:137)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Caused by: java.lang.ClassNotFoundException: org.ejml.simple.SimpleBase
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more

    at Error (native)
    at new StanfordCoreNLP (/Users/loretoparisi/Documents/Projects/musixmatch-intelligence-node-sdk/lib/nlp/stanford/index.js:104:28)
    at cld.detectCLDP.then.res (/Users/loretoparisi/Documents/Projects/musixmatch-intelligence-node-sdk/lib/tests/nlp/corenlp.js:65:31)
    at process._tickCallback (internal/process/next_tick.js:103:7)
    at Module.runMain (module.js:606:11)
    at run (bootstrap_node.js:394:7)
    at startup (bootstrap_node.js:149:9)
    at bootstrap_node.js:509:3 cause: nodeJava_java_lang_NoClassDefFoundError {} }

According to external docs, the Java math library EJML seems to be missing, so I downloaded its sources from here and compiled them from scratch using Gradle.

I have

java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

I then copied the built jars, i.e. the following ones, into the CLASSPATH:

core-0.30.jar
dense64-0.30.jar
simple-0.30.jar

I compiled against JVM 8. My CLASSPATH folder now looks like this:

simple-0.30-javadoc.jar
simple-0.30-sources.jar
simple-0.30.jar
stanford-arabic-corenlp-2016-10-31-models.jar
stanford-chinese-corenlp-2016-10-31-models.jar
stanford-corenlp-3.7.0-models.jar
stanford-corenlp.jar
stanford-english-corenlp-2016-10-31-models.jar
stanford-english-kbp-corenlp-2016-10-31-models.jar
stanford-french-corenlp-2016-10-31-models.jar
stanford-german-corenlp-2016-10-31-models.jar
stanford-spanish-corenlp-2016-10-31-models.jar

When I try to load the annotators I then get a new error

{ Error: Error creating class
edu.stanford.nlp.io.RuntimeIOException: java.io.InvalidClassException: org.ejml.simple.SimpleBase; local class incompatible: stream classdesc serialVersionUID = 7560584869544985034, local class serialVersionUID = -4908174115141247692
    at edu.stanford.nlp.sentiment.SentimentModel.loadSerialized(SentimentModel.java:633)
    at edu.stanford.nlp.pipeline.SentimentAnnotator.<init>(SentimentAnnotator.java:52)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.sentiment(AnnotatorImplementations.java:264)
    at edu.stanford.nlp.pipeline.AnnotatorFactories$16.create(AnnotatorFactories.java:446)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:152)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:451)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:154)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:150)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:137)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
Caused by: java.io.InvalidClassException: org.ejml.simple.SimpleBase; local class incompatible: stream classdesc serialVersionUID = 7560584869544985034, local class serialVersionUID = -4908174115141247692
    at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:616)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
    at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
...

That looks to me like a compiled class file version error (a JVM version mismatch). The other CoreNLP jars are fine, as I can see from the logs:

LANG ENGLISH en
Adding annotator tokenize
Adding annotator ssplit
Adding annotator pos
Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0,9 sec].
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2,2 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1,5 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0,8 sec].
Adding annotator sentiment

I understand that this may be a bit out of the blue, but since the EJML jars are required: which version is needed, and against which JVM must it be compiled?

Thanks.

J38 commented 7 years ago

The necessary jar is at this path: https://github.com/stanfordnlp/CoreNLP/blob/master/lib/ejml-0.23.jar

Also the standard Stanford CoreNLP 3.7.0 distribution folder comes with ejml.

It can be found here: http://stanfordnlp.github.io/CoreNLP/download.html

loretoparisi commented 7 years ago

@J38 Thanks. I have added EJML v0.23 to the path, but now I get a missing class error:

loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/ejml-0.23.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-arabic-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-chinese-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-corenlp-3.7.0-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-corenlp.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-english-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-english-kbp-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-french-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-german-corenlp-2016-10-31-models.jar
loaded jar:/Users/loretoparisi/Dropbox (musixmatch)/Development/data/data/stanford/stanford-spanish-corenlp-2016-10-31-models.jar
LANG ENGLISH en
Adding annotator tokenize
Adding annotator ssplit
Adding annotator pos
Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0,8 sec].
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2,2 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1,3 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0,7 sec].
Adding annotator sentiment
{ Error: Error creating class
java.lang.NoClassDefFoundError: org/ejml/simple/SimpleBase

You can see the loaded jars at the beginning.

loretoparisi commented 7 years ago

@J38 Which CoreNLP jar contains the ejml library? I'm still having this error:

Caused by: java.io.InvalidClassException: org.ejml.simple.SimpleBase; local class incompatible: stream classdesc serialVersionUID = 7560584869544985034, local class serialVersionUID = -4908174115141247692

I compiled the ejml project (https://github.com/lessthanoptimal/ejml) against JVM 1.8, the same version I built CoreNLP with, so I assumed it would work. The current version of EJML is 0.30.

So I have copied all built files

dense64-0.30.jar
core-0.30.jar
equation-0.30.jar
simple-0.30.jar

in my CLASSPATH.

If I use 0.23 as you suggest, it does not work either.
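The `InvalidClassException` quoted above compares two `serialVersionUID` values: the one stored in the serialized model (`stream classdesc`) and the one of the `org.ejml.simple.SimpleBase` class found on the classpath (`local class`). That usually indicates a different build of the library than the one the model was written with, rather than a JVM version problem. A minimal sketch for checking what the running JVM actually sees, using only JDK classes; the class name `SerialUidProbe` and its helper methods are hypothetical and not part of CoreNLP or EJML:

```java
import java.io.ObjectStreamClass;
import java.security.CodeSource;

public class SerialUidProbe {

    /** Returns the serialVersionUID the JVM uses for cls, or null if cls is not Serializable. */
    public static Long serialVersionUidOf(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            return null; // lookup() returns null for non-serializable classes
        }
        return desc.getSerialVersionUID();
    }

    /** Returns the jar or directory the class was loaded from, if the JVM exposes it. */
    public static String originOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(JDK runtime or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Default to the class named in the exception; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "org.ejml.simple.SimpleBase";
        try {
            // initialize=false: only load the class, do not run its static initializers.
            Class<?> cls = Class.forName(name, false, SerialUidProbe.class.getClassLoader());
            System.out.println(name + " loaded from " + originOf(cls));
            System.out.println("serialVersionUID = " + serialVersionUidOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Run with the same CLASSPATH as the failing program, this should print the jar that supplies the class and its local `serialVersionUID`. If that value differs from the `stream classdesc` value in the exception (7560584869544985034 here), the jar on the classpath is not the build the model was serialized with.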

J38 commented 7 years ago

Could you clarify how you are running your Java code and how you obtained Stanford CoreNLP 3.7.0? You don't need to build ejml to run the sentiment annotator with Stanford CoreNLP. The jar we provide should work fine with Stanford CoreNLP 3.7.0.

loretoparisi commented 7 years ago

@J38 I downloaded the sources and built them, then ran:

cd classes ; jar -cf ../stanford-corenlp.jar edu
export CLASSPATH=/Users/loretoparisi/Documents/CoreNLPDataset/*:/Users/loretoparisi/Documents/Projects/CoreNLP/lib/*:/Users/loretoparisi/Documents/Projects/CoreNLP/liblocal/*
java -mx6g -cp $CLASSPATH  edu.stanford.nlp.pipeline.StanfordCoreNLPServer

where the CLASSPATH contains the CoreNLP libraries, the models, the tokenizer, and the ejml libraries I compiled.

I have

$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

and the ejml 0.23 jar that you pointed me to does not seem to be compiled against JVM 8.
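Whether a jar was compiled for a particular JVM can be read from the class file itself instead of being inferred from the error. Every `.class` file starts with the magic number `0xCAFEBABE`, followed by a minor and a major version; major version 52 corresponds to Java 8, and a Java 8 JVM also loads older versions (51, 50, ...). A class that is too new for the JVM normally fails with `UnsupportedClassVersionError`, not `InvalidClassException`. A minimal sketch, with the hypothetical class name `ClassVersionProbe`, that reads the header of a class found on the classpath:

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassVersionProbe {

    /** Reads the class-file header from the stream and returns its major version. */
    public static int majorVersion(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            if (data.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a Java class file");
            }
            data.readUnsignedShort();        // minor version, ignored here
            return data.readUnsignedShort(); // major version, e.g. 52 for Java 8
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Resource path of the class to inspect; defaults to the class from the stack trace.
        String resource = args.length > 0 ? args[0] : "org/ejml/simple/SimpleBase.class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                System.out.println(resource + " is not on the classpath");
                return;
            }
            System.out.println(resource + " has class-file major version " + majorVersion(in));
        }
    }
}
```

A reported major version of 52 or lower means the class is loadable on Java 8, so a remaining failure would have to come from something else, such as a mismatched library build.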

J38 commented 7 years ago

When you set your CLASSPATH you want to have

/Users/loretoparisi/Documents/Projects/CoreNLP/lib/*:/Users/loretoparisi/Documents/Projects/CoreNLP/liblocal/*

and you want to have a directory with these jars:

stanford-corenlp-models-current.jar stanford-corenlp.jar

And that is all you should have. If you put ejml stuff you compiled in .../Documents/CoreNLPDataset/* that could definitely cause problems. The ejml jar you want to use is in CoreNLP/lib/*.

So this CLASSPATH should work fine:

/Users/loretoparisi/Documents/Projects/CoreNLP/lib/*:/Users/loretoparisi/Documents/Projects/CoreNLP/liblocal/*:/path/to/stanford-corenlp-models-current.jar:/path/to/stanford-corenlp.jar
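Since a stray copy of ejml elsewhere on the CLASSPATH can shadow the bundled jar, it may help to list every classpath entry that provides `org.ejml.simple.SimpleBase`. The following is a minimal sketch using only the JDK; `ClassLocator` and its methods are hypothetical names, not part of CoreNLP:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class ClassLocator {

    /** Converts a class name such as a.b.C into the resource path a/b/C.class. */
    public static String resourceNameOf(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns every location visible to the loader that contains the given class. */
    public static List<URL> locate(String className, ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(resourceNameOf(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.ejml.simple.SimpleBase";
        List<URL> hits = locate(name, ClassLoader.getSystemClassLoader());
        if (hits.isEmpty()) {
            System.out.println(name + " was not found on the classpath");
        }
        for (URL hit : hits) {
            // More than one line of output means several jars provide the same class.
            System.out.println(hit);
        }
    }
}
```

Run with the same CLASSPATH as the server, a single line of output pointing into `CoreNLP/lib/` would be the expected result; several lines suggest that an extra ejml jar is still being picked up.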

loretoparisi commented 7 years ago

@J38 Thanks a lot, this definitely solves the issue. It now picks up the bundled jar, and it works!

[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - setting default constituency parser
[main] INFO CoreNLP - using SR parser: edu/stanford/nlp/models/srparser/englishSR.ser.gz
[main] INFO CoreNLP -     Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000
[pool-1-thread-8] ERROR CoreNLP - Failure to load language specific properties.
[pool-1-thread-8] INFO CoreNLP - [/0:0:0:0:0:0:0:1:58441] API call w/annotators tokenize,ssplit,pos,parse,sentiment
I love you my life
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - No tokenizer type provided. Defaulting to PTBTokenizer.
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-8] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0,9 sec].
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[pool-1-thread-8] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/srparser/englishSR.ser.gz ... done [14,4 sec].
[pool-1-thread-8] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment