NLPbox / stanford-corenlp-docker

build/run the most current Stanford CoreNLP server in a docker container

Does coreference annotator work in this image? #6

Open cbrew opened 2 years ago

cbrew commented 2 years ago

Have you managed to get the coref annotator to work? I'm seeing the image silently crashing when I bring up the GUI and add the coref annotator. Using latest corenlp, built as you suggest.

No crash with coref when I use the node package from https://github.com/gerardobort/node-corenlp, natively on an Intel Mac with openjdk-15.

I'm wondering whether we have a Java version issue, are running out of memory, or something else.

arne-cl commented 2 years ago

I haven't used CoreNLP for coreference in a long time. If you can provide me with the exact commands that you ran, I'll try to reproduce the error.

cbrew commented 2 years ago

Launch the server, exactly as in your README.

Go to the GUI at localhost:9000.

Type any sentence and hit Submit, e.g. "the dog walks". It works and shows the Brat visualizations.

Click to add the coreference annotator and hit Submit again. It tries to load models, and the Docker process quits without returning anything. The GUI shows a red bar in the browser.

Make sense? If not, I can try with curl or something. But it will be the same, pretty sure.

Chris



cbrew commented 2 years ago

Get a server running on port 9000. Natively, that is:

npm explore corenlp -- npm run corenlp:server

You have to futz around a little, putting a CoreNLP distribution into node_modules/corenlp/corenlp/stanford-core-nlp-4.3.0 and adjusting the server start script to match.

Then run:

wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .coref

Native server output is:

{
  "2": [
    {
      "id": 0,
      "text": "John",
      "type": "PROPER",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 1,
      "endIndex": 2,
      "headIndex": 1,
      "sentNum": 1,
      "position": [
        1,
        1
      ],
      "isRepresentativeMention": true
    },
    {
      "id": 1,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 3,
      "endIndex": 4,
      "headIndex": 3,
      "sentNum": 1,
      "position": [
        1,
        2
      ],
      "isRepresentativeMention": false
    },
    {
      "id": 2,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 7,
      "endIndex": 8,
      "headIndex": 7,
      "sentNum": 1,
      "position": [
        1,
        3
      ],
      "isRepresentativeMention": false
    }
  ]
}
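For longer inputs the raw chain dump gets hard to read. The following is a minimal sketch, not part of the original report, that condenses each chain to one line with jq. It assumes the full server response is piped in and that the chains sit under a top-level `corefs` key; the function name `summarize_chains` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a CoreNLP JSON response on stdin and print
# one line per coreference chain, "<chain id>: <mention>, <mention>, ...".
# Assumes the chains are stored under the top-level "corefs" key.
summarize_chains() {
  jq -r '.corefs | to_entries[] | "\(.key): \(.value | map(.text) | join(", "))"'
}

# Example (requires a running server):
#   wget -q --post-data 'John said he would come but he did not' \
#     'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - \
#     | summarize_chains
```

For the sentence used in this thread, the three mentions shown above would collapse to a single line for chain 2.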

Using a Docker image of CoreNLP 4.3.0 built from your Dockerfile:

docker run -p 9000:9000 corenlp

In the Docker terminal:

[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - Server default properties:
    (Note: unspecified annotator properties are English defaults)
    annotators = tokenize,ssplit,parse
    inputFormat = text
    outputFormat = json
    prettyPrint = false
[main] INFO CoreNLP - Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0.0.0.0:9000
[pool-1-thread-1] INFO CoreNLP - [/172.17.0.1:64090] API call w/annotators tokenize,ssplit,pos,lemma,ner,depparse,coref
John said he would come but he did not
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-1] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words-distsim.tagger ... done [1.1 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2.9 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.2 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [1.1 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[pool-1-thread-1] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: edu/stanford/nlp/models/sutime/defs.sutime.txt,edu/stanford/nlp/models/sutime/english.sutime.txt,edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 580705 unique entries out of 581864 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab, 0 TokensRegex patterns.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 4867 unique entries out of 4867 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_cased.tab, 0 TokensRegex patterns.
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 585572 unique entries from 2 files
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.NERCombinerAnnotator - numeric classifiers: true; SUTime: true [no docDate]; fine grained: true
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... Time elapsed: 1.2 sec
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 20000 vectors, elapsed Time: 2.701 sec
[pool-1-thread-1] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [3.9 sec].
[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
(venv)  ~/working/examp/

And the docker process dies before returning anything.

In the query window it goes like this, which makes a good deal of sense given that the Docker process dies:

~/Documents/GitHub/ wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .coref
--2021-10-20 10:48:50-- http://localhost:9000/?properties=%7B%22annotators%22:%20%22coref%22,%20%22outputFormat%22:%20%22json%22%7D
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9000... connected.
HTTP request sent, awaiting response... No data received.
Retrying.

--2021-10-20 10:49:22-- (try: 2) http://localhost:9000/?properties=%7B%22annotators%22:%20%22coref%22,%20%22outputFormat%22:%20%22json%22%7D
Connecting to localhost (localhost)|::1|:9000... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9000... failed: Connection refused.
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:9000... failed: Connection refused.
Connecting to localhost (localhost)|127.0.0.1|:9000... failed: Connection refused.
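The server log stops at "Adding annotator coref" without a Java stack trace, which would be consistent with the process being killed from outside rather than failing inside the JVM. One way to check the memory hypothesis raised at the top of the thread is to ask Docker how the container exited. This is only a sketch, not something from the repository: it assumes the crashed container is the most recently created one and was not started with `--rm`, and the helper names `explain_exit` and `diagnose_last_container` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate a container exit status into a likely cause.
# 137 is 128 + 9, i.e. the process received SIGKILL.
explain_exit() {
  case "$1" in
    0)   echo "clean exit" ;;
    137) echo "SIGKILL (128+9), typical of an out-of-memory kill" ;;
    *)   echo "exit status $1, check 'docker logs' for a Java stack trace" ;;
  esac
}

# Report how the most recently created container ended.
# Assumes that container still exists (it was not run with --rm).
diagnose_last_container() {
  cid=$(docker ps -lq) || return 1
  [ -n "$cid" ] || { echo "no container found" >&2; return 1; }
  code=$(docker inspect --format '{{.State.ExitCode}}' "$cid") || return 1
  oom=$(docker inspect --format '{{.State.OOMKilled}}' "$cid") || return 1
  echo "container $cid: OOMKilled=$oom, $(explain_exit "$code")"
}

# Usage, right after the crash:
#   diagnose_last_container
```

If this reports `OOMKilled=true` or status 137, memory is the more likely culprit than the Java version. On macOS that would point at the memory allotted to the Docker Desktop VM.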

arne-cl commented 3 months ago

It seems that I added an ANNOTATORS env to the Dockerfile in 2021, but I never answered in this thread. I have changed the default to make CoreNLP always use all annotators, so running

docker buildx build -t corenlp https://github.com/NLPbox/stanford-corenlp-docker.git
docker run -p 9000:9000 corenlp

in one terminal and running your query in another should give you the desired result:

wget --post-data 'John said he would come but he did not' 'localhost:9000/?properties={"annotators": "coref", "outputFormat": "json"}' -O - | jq .corefs
{
  "2": [
    {
      "id": 0,
      "text": "John",
      "type": "PROPER",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 1,
      "endIndex": 2,
      "headIndex": 1,
      "sentNum": 1,
      "position": [
        1,
        1
      ],
      "isRepresentativeMention": true
    },
    {
      "id": 1,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 3,
      "endIndex": 4,
      "headIndex": 3,
      "sentNum": 1,
      "position": [
        1,
        2
      ],
      "isRepresentativeMention": false
    },
    {
      "id": 2,
      "text": "he",
      "type": "PRONOMINAL",
      "number": "SINGULAR",
      "gender": "MALE",
      "animacy": "ANIMATE",
      "startIndex": 7,
      "endIndex": 8,
      "headIndex": 7,
      "sentNum": 1,
      "position": [
        1,
        3
      ],
      "isRepresentativeMention": false
    }
  ]
}
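When both steps are scripted instead of run in two terminals, the query can be sent before the server is listening, which produces the same "Connection refused" seen earlier in the thread. A minimal sketch of a retry loop follows. The helper name `wait_until` is made up, and the `/ready` endpoint in the usage comment is an assumption about the CoreNLP server; any cheap request that succeeds only once the server is up would do.

```shell
#!/bin/sh
# Hypothetical helper: retry a command until it succeeds.
# usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Usage (assumes the server answers on /ready once it is listening):
#   docker run -d -p 9000:9000 corenlp
#   wait_until 30 2 wget -q -O - localhost:9000/ready || echo "server did not come up" >&2
```

The first coref request can still take a while, since the server log in this thread shows models being loaded on the first API call rather than at startup.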