stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0
9.67k stars, 2.7k forks

Problem running CorefAnnotator on CoNLL files #895

Closed by rahular 5 years ago

rahular commented 5 years ago

I am following the guide here and trying to evaluate the statistical/neural coreference annotator on CoNLL 2012 files. The jar linked on that page does not contain neural/english-model-conll.ser.gz or statistical/*_conll.ser.gz, but this jar does.

After figuring that out, I ran the annotator with the command given on the page and got the following output:

└─$ java -Xmx6g -cp stanford-corenlp-3.7.0.jar:stanford-english-corenlp-models-3.7.0.jar:* edu.stanford.nlp.coref.CorefSystem -props edu/stanford/nlp/coref/properties/neural-english-conll.properties -coref.data ~/ann-coref/test.conll -coref.conllOutputPath ./logs -coref.scorer ~/conll-2012/scorer/v8.01/scorer.pl
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref model edu/stanford/nlp/models/coref/neural/english-model-conll.ser.gz ... done [0.5 sec].
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref embeddings edu/stanford/nlp/models/coref/neural/english-embeddings.ser.gz ... done [0.6 sec].
[main] INFO CoreNLP - Identification of Mentions: Recall: (0 / 0) 0%    Precision: (0 / 0) 0%   F1: 0%
[main] INFO CoreNLP - METRIC muc:Coreference: Recall: (0 / 0) 0%    Precision: (0 / 0) 0%   F1: 0%
METRIC bcub:Coreference: Recall: (0 / 0) 0% Precision: (0 / 0) 0%   F1: 0%
METRIC ceafm:Coreference: Recall: (0 / 0) 0%    Precision: (0 / 0) 0%   F1: 0%
METRIC ceafe:Coreference: Recall: (0 / 0) 0%    Precision: (0 / 0) 0%   F1: 0%
METRIC blanc:Coreference links: Recall: (0 / 0) 0%  Precision: (0 / 0) 0%   F1: 0%
Non-coreference links: Recall: (0 / 0) 0%   Precision: (0 / 0) 0%   F1: 0%
BLANC: Recall: (0 / 1) 0%   Precision: (0 / 1) 0%   F1: 0%
[main] INFO CoreNLP - Final conll score ((muc+bcub+ceafe)/3) = 0

It looks like the model is not predicting anything, so all the scores are 0.

Any idea what might be the issue here?

J38 commented 5 years ago

If you want to run on the CoNLL 2012 files, you should have a directory structure such as /Users/myusername/conll-2012/v9, and coref.data should be set to the parent directory, /Users/myusername/conll-2012. Your output (every count is 0 / 0, for both recall and precision) suggests that none of the data is being loaded.

CoNLL 2012 data can be found here: http://conll.cemantix.org/2012/data.html
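
A quick way to catch this before launching the evaluation is to check the path you plan to pass as coref.data. The sketch below is not part of CoreNLP: check_coref_data is a hypothetical helper, and it assumes the layout from the example above, where a v9 directory sits directly under the corpus root. Adjust the subdirectory name if your copy of the data is organized differently.

```shell
# Hypothetical pre-flight check (not a CoreNLP tool): verify that the path
# intended for -coref.data is a corpus root directory rather than a single file.
check_coref_data() {
  root="$1"
  if [ ! -d "$root" ]; then
    echo "not a directory: $root" >&2
    return 1
  fi
  # Assumption: the data lives in a v9 subdirectory directly under the root,
  # as in the /Users/myusername/conll-2012/v9 example.
  if [ ! -d "$root/v9" ]; then
    echo "no v9 subdirectory under: $root" >&2
    return 2
  fi
  echo "ok: $root"
}
```

For example, `check_coref_data ~/conll-2012` should print an "ok" line before you rerun the original command with -coref.data ~/conll-2012, whereas `check_coref_data ~/ann-coref/test.conll` returns a non-zero status because that path is a file.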