8b728cbc8f227c0e0eba2bc641a2ffcdc59d226b 334a2d48edd59f954648a04fc1d980afab3bcaff
Alright, so these last two commits fix the bug that occurred because the Stanford NLP team broke their NER configuration API in the latest version of their software. Now it looks like there's a configuration problem with the SUTime package, which I don't know how to fix. I'm awaiting a response from their team.
:+1:
e64600df12d4ccf04930d5dc7051e74cb665454d 8cf1b51a7ef188381dd58395e3af7a9de231657f
Alright, everything is fixed now. I added some specs to make sure this doesn't break in the future, and pushed the version with the fixes as 0.4.3. You'll need to download this JAR file and put it in the /bin folder (it's now included in the latest packages).
Please note that from 0.4.3 on, JRuby 1.6.* is no longer supported. If you were using JRuby, you'll need to upgrade to 1.7.1.
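If it helps, here is a minimal sketch of how the new JAR could be dropped into the installed gem's /bin folder from Ruby (the source path and the stanford-corenlp.jar filename are placeholders rather than the actual download; adjust both to match the file you grabbed):

# Sketch: copy the updated CoreNLP JAR into the installed gem's bin/ directory.
# The source path and JAR filename below are placeholders.
require 'rubygems'
require 'fileutils'

gem_bin = File.join(Gem::Specification.find_by_name('stanford-core-nlp').gem_dir, 'bin')
FileUtils.cp('/path/to/downloaded/stanford-corenlp.jar', gem_bin)
puts "Copied JAR into #{gem_bin}"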
Excellent, thanks for the quick response!
OK, for me it works with the minimal English zip, but not with the full zip; it breaks at :parse. The minimal zip is fine for my purposes, but I thought you might like to know, and this seemed like the best place to report it.
pipeline = StanfordCoreNLP.load(:tokenize, :ssplit, :pos, :lemma, :parse, :ner, :dcoref)
Adding annotator tokenize
Adding annotator ssplit
Adding annotator pos
Adding annotator lemma
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ...
java.io.IOException: Unable to resolve "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:408)
at edu.stanford.nlp.io.IOUtils.readStreamFromString(IOUtils.java:356)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromSerializedFile(LexicalizedParser.java:530)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:328)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:148)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:134)
at edu.stanford.nlp.pipeline.ParserAnnotator.loadModel(ParserAnnotator.java:147)
at edu.stanford.nlp.pipeline.ParserAnnotator.<init>(ParserAnnotator.java:94)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$12.create(StanfordCoreNLP.java:777)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:80)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:301)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:145)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:141)
Loading parser from text file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz java.io.IOException: Unable to resolve "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz" as either class path, filename or URL
at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:408)
at edu.stanford.nlp.io.IOUtils.readReaderFromString(IOUtils.java:427)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromTextFile(LexicalizedParser.java:464)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:330)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:148)
at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:134)
at edu.stanford.nlp.pipeline.ParserAnnotator.loadModel(ParserAnnotator.java:147)
at edu.stanford.nlp.pipeline.ParserAnnotator.<init>(ParserAnnotator.java:94)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$12.create(StanfordCoreNLP.java:777)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:80)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:301)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:145)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:141)
NullPointerException: unknown exception
from /Users/thomas/.rvm/gems/ruby-1.9.3-p327@core-nlp/gems/stanford-core-nlp-0.4.3/lib/stanford-core-nlp.rb:176:in `new'
from /Users/thomas/.rvm/gems/ruby-1.9.3-p327@core-nlp/gems/stanford-core-nlp-0.4.3/lib/stanford-core-nlp.rb:176:in `load'
from (irb):11
from /Users/thomas/.rvm/rubies/ruby-1.9.3-p327/bin/irb:18:in `<main>'
Now fixed. Thanks for reporting.
I just tried to follow the instructions in the readme and hit something that looks like this same issue. I extracted the full zip and pasted it into the gem's bin directory, then copied the code from the "Using the gem" section of the readme (roughly the snippet sketched after the directory listing below) and ran it under Ruby 1.9.3. My error is:
Loading classifier from /Users/nbrustein/code/german/edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... java.io.FileNotFoundException: edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz (No such file or directory)
Here is what my bin/ dir looks like. Am I doing something wrong?
ls /Users/nbrustein/.rvm/gems/ruby-1.9.3-p545@german/gems/stanford-core-nlp-0.5.1/bin
total 20656
drwxr-xr-x  18 nbrustein staff  612B Aug 31 12:01 .
drwxr-xr-x   6 nbrustein staff  204B Aug 31 11:53 ..
-rw-r--r--   1 nbrustein staff  851B Aug 31 11:53 AnnotationBridge.java
-rw-r--r--@  1 nbrustein staff  915B Aug 31 12:01 bridge.jar
drwxr-xr-x@  8 nbrustein staff  272B Aug 31 12:01 classifiers
drwxr-xr-x@ 16 nbrustein staff  544B Aug 31 12:01 dcoref
drwxr-xr-x@  3 nbrustein staff  102B Aug 31 12:01 gender
drwxr-xr-x@ 16 nbrustein staff  544B Aug 31 12:01 grammar
-rw-r--r--@  1 nbrustein staff  557K Aug 31 12:01 joda-time.jar
-rw-r--r--@  1 nbrustein staff  196K Aug 31 12:01 jollyday.jar
drwxr-xr-x@  3 nbrustein staff  102B Aug 31 12:01 regexner
-rw-r--r--@  1 nbrustein staff  4.2M Aug 31 12:01 stanford-corenlp.jar
-rw-r--r--@  1 nbrustein staff  2.4M Aug 31 12:01 stanford-parser.jar
-rw-r--r--@  1 nbrustein staff  2.4M Aug 31 12:01 stanford-segmenter.jar
drwxr-xr-x@  7 nbrustein staff  238B Aug 31 12:01 sutime
drwxr-xr-x@ 33 nbrustein staff  1.1K Aug 31 12:01 taggers
drwxr-xr-x@  5 nbrustein staff  170B Aug 31 12:01 truecase
-rw-r--r--@  1 nbrustein staff  306K Aug 31 12:01 xom.jar
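For reference, the code being run is roughly the README's "Using the gem" snippet, along the lines of the sketch below (paraphrased; the exact annotation keys should be checked against the README):

# Sketch of the README-style usage being run (paraphrased, not the exact snippet).
require 'stanford-core-nlp'

StanfordCoreNLP.use :english
pipeline = StanfordCoreNLP.load(:tokenize, :ssplit, :pos, :lemma, :ner)

text = StanfordCoreNLP::Annotation.new('Angela Merkel met Nicolas Sarkozy on January 25th in Berlin.')
pipeline.annotate(text)

text.get(:sentences).each do |sentence|
  sentence.get(:tokens).each do |token|
    # Print each token with the named-entity tag assigned by the :ner annotator.
    puts token.get(:value).to_s + ' / ' + token.get(:named_entity_tag).to_s
  end
end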
After fixing https://github.com/Organiz3r/stanford-core-nlp/commit/b47790389eea4a6ae7e6628873494c1be94caa33 :
When trying:
pipeline = StanfordCoreNLP.load(:tokenize, :ssplit, :pos, :lemma, :parse, :ner)
This happens:
Apparently it looks for the models in the current directory (I was in /Users/thomas) plus edu/stanford/nlp/models/ner. I have no idea why it's doing that, since it is able to find all the other CoreNLP files just fine, as evidenced by the above output. It should be looking for the proper files in {gem path}/bin/classifiers/, I think.
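As a possible workaround until this is sorted out, the gem's documented configuration accessors (jar_path, model_path, and set_model, as described in the README; the exact relative layout under bin/ may need adjusting) could be pointed at the gem's own bin/ directory, so the NER classifier resolves from {gem path}/bin/classifiers/ rather than the current working directory:

# Workaround sketch: resolve jars and models from the gem's own bin/ directory
# instead of the current working directory. Accessor names are taken from the
# README; the classifier's relative path may need adjusting.
require 'stanford-core-nlp'

gem_bin = File.join(Gem::Specification.find_by_name('stanford-core-nlp').gem_dir, 'bin') + '/'

StanfordCoreNLP.use :english
StanfordCoreNLP.jar_path   = gem_bin
StanfordCoreNLP.model_path = gem_bin
StanfordCoreNLP.set_model('ner.model', 'classifiers/english.all.3class.distsim.crf.ser.gz')

pipeline = StanfordCoreNLP.load(:tokenize, :ssplit, :pos, :lemma, :parse, :ner)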