stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

Error occurs when adding annotator 'kbp' for Chinese language #958

Open stone-ts15 opened 4 years ago

stone-ts15 commented 4 years ago

I'm using KBP for relation extraction in Chinese. According to the official introduction, a Chinese model is currently available. I added the kbp annotator to StanfordCoreNLP-chinese.properties. When I ran the client through the Python interface, the following error occurred:

Starting server with command: java -Xmx6G -cp %CORENLP_HOME%/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 300000 -threads 8 -maxCharLength 100000 -quiet False -serverProperties StanfordCoreNLP-chinese.properties -preload tokenize,ssplit,pos,lemma,ner,parse,coref,kbp
[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - setting default constituency parser
[main] INFO CoreNLP - warning: cannot find edu/stanford/nlp/models/srparser/englishSR.ser.gz
[main] INFO CoreNLP - using: edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz instead
[main] INFO CoreNLP - to use shift reduce parser download English models jar from:
[main] INFO CoreNLP - http://stanfordnlp.github.io/CoreNLP/download.html
[main] INFO CoreNLP -     Threads: 8
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/segmenter/chinese/ctb.gz ... done [10.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/chinese-distsim/chinese-distsim.tagger ... done [0.8 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/chinese.misc.distsim.crf.ser.gz ... done [5.2 sec].
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 21238 unique entries out of 21249 from edu/stanford/nlp/models/kbp/chinese/gazetteers/cn_regexner_mapping.tab, 0 TokensRegex patterns.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/srparser/chineseSR.ser.gz ... done [19.0 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
[main] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - Using mention detector type: rule
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator kbp
[main] ERROR CoreNLP - Could not pre-load annotators in server; encountered exception:
java.util.regex.PatternSyntaxException: Unclosed character class near index 3
["鈥漖
   ^
        at java.util.regex.Pattern.error(Unknown Source)
        at java.util.regex.Pattern.clazz(Unknown Source)
        at java.util.regex.Pattern.sequence(Unknown Source)
        at java.util.regex.Pattern.expr(Unknown Source)
        at java.util.regex.Pattern.compile(Unknown Source)
        at java.util.regex.Pattern.<init>(Unknown Source)
        at java.util.regex.Pattern.compile(Unknown Source)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern.<init>(NodePattern.java:81)
        at edu.stanford.nlp.semgraph.semgrex.NodePattern.<init>(NodePattern.java:47)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.Description(SemgrexParser.java:543)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.Child(SemgrexParser.java:440)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.ModNode(SemgrexParser.java:415)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.Relation(SemgrexParser.java:329)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.RelChild(SemgrexParser.java:230)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.ModRelation(SemgrexParser.java:195)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.RelationConj(SemgrexParser.java:176)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.RelationDisj(SemgrexParser.java:123)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.SubNode(SemgrexParser.java:103)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexParser.Root(SemgrexParser.java:34)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexPattern.compile(SemgrexPattern.java:291)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexBatchParser.parse(SemgrexBatchParser.java:57)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexBatchParser.compileStream(SemgrexBatchParser.java:47)
        at edu.stanford.nlp.semgraph.semgrex.SemgrexBatchParser.compileStream(SemgrexBatchParser.java:39)
        at edu.stanford.nlp.ie.KBPSemgrexExtractor.<init>(KBPSemgrexExtractor.java:56)
        at edu.stanford.nlp.pipeline.KBPAnnotator.<init>(KBPAnnotator.java:115)
        at edu.stanford.nlp.pipeline.AnnotatorImplementations.kbp(AnnotatorImplementations.java:290)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$25(StanfordCoreNLP.java:543)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$30(StanfordCoreNLP.java:602)
        at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
        at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
        at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:251)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:192)
        at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:188)
        at edu.stanford.nlp.pipeline.StanfordCoreNLPServer.main(StanfordCoreNLPServer.java:1505)

I have downloaded the Chinese models, and NER returns results. Does anybody know the reason for this error?
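The garbled pattern in the exception is consistent with UTF-8 bytes being decoded with a different charset. This is a hypothesis, not something confirmed in this thread. The UTF-8 bytes of `["”]` (a character class holding a straight quote and U+201D) are 5B 22 E2 80 9D 5D, and a GBK decoder pairs 9D with the 5D that follows it, so the closing bracket disappears. The sketch below tries to reproduce that effect; the class name, the method names, and the pattern itself are illustrative guesses, not taken from the CoreNLP rule files.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes with another charset. */
    public static String misdecode(String text, String wrongCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName(wrongCharset));
    }

    /** Returns true if the string is accepted by java.util.regex. */
    public static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed pattern: '[', '"', U+201D (right double quotation mark), ']'.
        // Written with a Unicode escape so the source file encoding does not matter.
        String intended = "[\"\u201D]";
        System.out.println("intended      compiles: " + compiles(intended));

        // Charset names are assumptions about what a Windows locale might use.
        for (String name : new String[] {"GBK", "windows-1252"}) {
            String garbled = misdecode(intended, name);
            System.out.println(name + " -> " + garbled.length()
                    + " chars, compiles: " + compiles(garbled));
        }
    }
}
```

On a JVM where the GBK charset is available, the GBK result should no longer compile, while the windows-1252 result still does, because there every byte decodes on its own and the `]` survives. If the hypothesis holds, that could be one way for the same jar to load on one Windows machine and fail on another.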

AngledLuffa commented 4 years ago

What is the output if you run this command? (Note that you might need a -cp parameter or set your CLASSPATH for your own configuration)

java edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 300000 -threads 8 -maxCharLength 100000 -quiet False -serverProperties StanfordCoreNLP-chinese.properties -preload tokenize,ssplit,pos,lemma,ner,parse,coref,kbp

stone-ts15 commented 4 years ago

@AngledLuffa When I ran this command directly (with memory set to 6G), the same error occurred, with the same output as above.

AngledLuffa commented 4 years ago

Have you edited or in any way changed the kbp data? It loads fine out of the box for me.

Apologies for the long delay in replying.

J38 commented 4 years ago

What version of Stanford CoreNLP are you running? What Java are you using?

stone-ts15 commented 4 years ago

Thanks for the reply, @AngledLuffa! Sorry, I don't know where the kbp data is. I downloaded the jar file for Chinese and copied it to %CORENLP_HOME%. I'm using Stanford CoreNLP 3.9.2 with 64-bit Java 8u231. Does the Windows 10 operating system matter?
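One setting that can differ between a Windows machine and a Linux or macOS machine is the JVM's default charset: before JDK 18 it is derived from the operating system locale, and from JDK 18 onward it is UTF-8. Whether that is what matters here is not established in this thread. A minimal sketch for checking it on the affected machine (the class name is illustrative):

```java
import java.nio.charset.Charset;

public class ShowDefaultCharset {
    /** Describes the charset the JVM uses when code does not name one. */
    public static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If this prints something other than UTF-8, the default can be overridden at launch with the `-Dfile.encoding=UTF-8` system property on the `java` command line. That the override would resolve the error above is an assumption and has not been tested in this thread.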

AngledLuffa commented 4 years ago

I honestly have no idea what could be causing this problem.

I ran this command:

java -cp * edu.stanford.nlp.pipeline.StanfordCoreNLP -properties StanfordCoreNLP-chinese.properties -annotators "tokenize,ssplit,pos,lemma,ner,parse,coref,kbp"

No problems. I then tried this:

java -cp * edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 300000 -threads 8 -maxCharLength 100000 -quiet False -serverProperties StanfordCoreNLP-chinese.properties -preload tokenize,ssplit,pos,lemma,ner,parse,coref,kbp
... snip ...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
[main] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - Using mention detector type: rule
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator kbp
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000

This is on Windows 10. As it turns out, I'm using Java 12.0.1. However, unless there are some encoding changes between versions, I'm not sure how that would affect things.

Can you try this with a clean download?

Without further information I think this is "cannot reproduce".

J38 commented 4 years ago

java.util.regex has changed across Java versions. That being said, I seem to be able to run a basic Chinese KBP pipeline with Java 8 and Java 11.

J38 commented 4 years ago

It seems like the error is in reading in the semgrex pattern files, and perhaps there is some issue with the Java version on Windows. I only have access to macOS and Ubuntu systems to test out on.
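If the pattern files are read with the platform default charset (an assumption; this thread does not show how CoreNLP opens them), the decoded text would depend on the machine's locale. The sketch below is not CoreNLP code. It only illustrates the general difference between naming a charset when wrapping an `InputStream` and relying on the default; the class name, method name, and sample content are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class PatternFileReading {
    /** Reads all lines from a stream, decoding its bytes with the given charset. */
    public static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for a UTF-8 encoded pattern file; the content is hypothetical.
        byte[] fileBytes = "[\"\u201D]\n".getBytes(StandardCharsets.UTF_8);

        // Explicit charset: the decoded text is the same on every platform.
        System.out.println(readLines(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));

        // Platform default: the decoded text depends on the machine's settings.
        System.out.println(readLines(new ByteArrayInputStream(fileBytes), Charset.defaultCharset()));
    }
}
```

On a machine whose default charset is already UTF-8, both lines print the same text, which would fit the successful runs on macOS and Ubuntu reported in this thread.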

J38 commented 4 years ago

You might try upgrading Java and seeing if that helps...

J38 commented 4 years ago

My two successes are:

macOS, Java 11.0.1
Ubuntu, Java 1.8.0_172

stone-ts15 commented 4 years ago

I ran this command on an Ubuntu system with Java 13, and it worked fine. I believe the issue is specific to my Windows system.