stanfordnlp / CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
http://stanfordnlp.github.io/CoreNLP/
GNU General Public License v3.0

Coreference failure #905

Closed BvdM0 closed 4 years ago

BvdM0 commented 5 years ago

I've been using the QuoteAttributor on news articles, and 26% of the quotes returned have been assigned the speaker 'unknown'. I've re-run a sample of the failed quotes and pasted some of the problems I found below. I'm not sure whether this is a result of my misunderstanding, a technical problem at my end, or an issue with the software. I've been running CoreNLP on .txt files (returning .json) from Terminal with the command:

java -cp "*" -Xmx10g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,depparse,coref,quote -file input.txt -outputFormat json

Any help would be greatly appreciated!

Case 1: Descriptive clause 1
Including a descriptive clause after the name, in a case where there is a paragraph break within the quote, leads to failure of attribution (canonical speaker = ‘he’).

Failure: Hugh Hill, a doctor, said: "They need urgent treatment to fix the issue.

"If they don't get it they'll die," he added.

Success: Hugh Hill said: "They need urgent treatment to fix the issue.

"If they don't get it they'll die," he added.

Case 2: Descriptive clause 2
Same issue, but with a sentence break between name and quote start, instead of a paragraph break within the quote.

Failure: Last night Khalid Mahmood, a Labour MP, said he was worried. "What he has done previously is bring the whole community into disrepute and what he is doing now, whether it is legal or not, will do the same," he said.

Success: Last night Khalid Mahmood said he was worried. "What he has done previously is bring the whole community into disrepute and what he is doing now, whether it is legal or not, will do the same," he said.

Case 3: Forgetting first name
Returns speaker as ‘Smith’, not ‘John Smith’.

John Smith was banned from any involvement in schools after the so-called Trojan horse scandal.

Mr Smith said: "I'm a trainer by profession so I want to use my skill to benefit the parents in educating their children."

Case 4: Titles
Returns speaker as ‘his’, not ‘Alam’ or ‘Mr Alam’.

Failure (speaker = ‘his’): Mr Alam confirmed that the seminars were the first to be held by his association since the scandal.

He said: "I'm a trainer by profession so I want to use my skill to benefit the parents in educating their children."

Success (speaker = ‘Tahir Alam’): Tahir Alam confirmed that the seminars were the first to be held by his association since the scandal.

He said: "I'm a trainer by profession so I want to use my skill to benefit the parents in educating their children."

Case 5: Repeated use of pronoun
Second quote returns speaker ‘unknown’.

Jack Letts has told how he wanted to be a suicide bomber.

"I know I was definitely an enemy of Britain," he told the BBC. "If there was a battle, I'm ready", he added.

AngledLuffa commented 5 years ago

I've gotten as far as figuring out that this is a coref issue, not a QuoteAnnotator issue.

How to solve it, not yet sure.

For example:


text:

Hugh Hill, a doctor, said: "They need urgent treatment to fix the issue.

"If they don't get it they'll die," he added.

coref:

no correlation between "he" and "Hugh Hill"

quote:

attributed to "he"

vs


text: Hugh Hill, a doctor, said: "They need urgent treatment to fix the issue.

"If they don't get it they'll die," he added.

coref:

Coreference set: (2,13,[13,14]) -> (1,2,[1,3]), that is: "he" -> "Hugh Hill"

quote:

attributed to "Hugh Hill"

AngledLuffa commented 5 years ago

Oof, cut & paste error.

The second example should not have "a doctor"


text: Hugh Hill said: "They need urgent treatment to fix the issue.

"If they don't get it they'll die," he added.

coref:

Coreference set: (2,13,[13,14]) -> (1,2,[1,3]), that is: "he" -> "Hugh Hill"

quote:

attributed to "Hugh Hill"
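
For anyone reading the coref lines above: as far as I can tell from the text output, each mention is written as (sentence number, head token index, [start token, end token]), with 1-based token indices and an exclusive end. That reading is an assumption drawn from the examples here ("Hugh Hill" is tokens 1 and 2 of sentence 1, "he" is token 13 of sentence 2), not something stated in this thread. A minimal sketch that pulls those numbers out of one line follows; the class name CorefLine and the method parseMentions are made up for illustration and are not part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CorefLine {
    // Matches one mention written as (sentence,head,[start,end]).
    static final Pattern MENTION =
        Pattern.compile("\\((\\d+),(\\d+),\\[(\\d+),(\\d+)\\]\\)");

    /** Returns {sentence, head, start, end} for each mention in the line, in order. */
    static int[][] parseMentions(String line) {
        List<int[]> mentions = new ArrayList<>();
        Matcher m = MENTION.matcher(line);
        while (m.find()) {
            mentions.add(new int[] {
                Integer.parseInt(m.group(1)),  // sentence number
                Integer.parseInt(m.group(2)),  // head token index
                Integer.parseInt(m.group(3)),  // first token of the mention
                Integer.parseInt(m.group(4))   // end token (assumed exclusive)
            });
        }
        return mentions.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        String line = "(2,13,[13,14]) -> (1,2,[1,3]), that is: \"he\" -> \"Hugh Hill\"";
        for (int[] mention : parseMentions(line)) {
            System.out.printf("sentence %d, head token %d, tokens %d up to %d%n",
                mention[0], mention[1], mention[2], mention[3]);
        }
    }
}
```

The first mention on the line is the pronoun and the second is the antecedent it was linked to, so a failed case shows up as the pronoun having no such line at all.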

BvdM0 commented 5 years ago

Ah, that makes sense. I've amended the title accordingly.

AngledLuffa commented 5 years ago

On further investigation, and after speaking with the author of the module in question, the issue is exactly what you discovered: the system handles "person" and "person, a type of person" differently. The best solution is to gather more training data and use it to improve the coref.

There are some newer datasets which we hope to add to the coref models. Hopefully that will fix this error; until then, it's probably going to stick around for a while.

BvdM0 commented 5 years ago

Thanks! Do you know any other NLP software which might be better at handling that kind of sentence structure? I'm working with news articles so that's extremely common, and I'm very much a beginner with coding so would rather have a pre-fab solution if possible.

BvdM0 commented 5 years ago

Do you think it's possible to integrate HuggingFace's neuralcoref with CoreNLP's QuoteAttributor?

AngledLuffa commented 5 years ago

I personally don't know anything about that coref system, but you can go in one of two directions:

1) run HuggingFace first, massage the input into something that fits QuoteAnnotator, call QuoteAnnotator directly

2) create a custom annotator which runs HuggingFace as part of CoreNLP:

https://stanfordnlp.github.io/CoreNLP/new_annotator.html

Either way, you'd need to dissect the annotations as they are just before the call to QuoteAnnotator to figure out what it needs. In other words, try using the API without the QuoteAnnotator to see which annotations are set.

John

BvdM0 commented 5 years ago

Thanks! I decided to add a custom annotator but I'm having difficulty with the example.

In my main directory I added sampleProps.properties and CustomLemmaAnnotator.java just as in the example. However, when I run:

java -cp "*" -Xmx10g edu.stanford.nlp.pipeline.StanfordCoreNLP -props sampleProps.properties -file input.txt -outputFormat text

I get the following error:

[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator custom.lemma
Exception in thread "main" edu.stanford.nlp.util.MetaClass$ClassCreationException: java.lang.ClassNotFoundException: edu.stanford.nlp.examples.CustomLemmaAnnotator
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:364)
    at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:381)
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.custom(AnnotatorImplementations.java:141)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$28(StanfordCoreNLP.java:583)
    at edu.stanford.nlp.util.Lazy$3.compute(Lazy.java:126)
    at edu.stanford.nlp.util.Lazy.get(Lazy.java:31)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:149)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:251)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:192)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:188)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1388)
Caused by: java.lang.ClassNotFoundException: edu.stanford.nlp.examples.CustomLemmaAnnotator
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
    at java.base/java.lang.Class.forName0(Native Method)
    at java.base/java.lang.Class.forName(Class.java:332)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.construct(MetaClass.java:135)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:202)
    at edu.stanford.nlp.util.MetaClass$ClassFactory.<init>(MetaClass.java:69)
    at edu.stanford.nlp.util.MetaClass.createFactory(MetaClass.java:360)
    ... 10 more

Do you know what the problem might be? Are the file types wrong, or are they in the wrong location? I'm completely new to Java so I'm completely stuck.

Thanks again for your help, Ben

J38 commented 5 years ago

If you're using the 3.9.2 distribution you need to put CustomLemmaAnnotator.java into a directory like edu/stanford/nlp/examples and compile it.

If you're going to be writing custom classes, you should probably clone the repo, and add CustomLemmaAnnotator.java (or whatever your custom classes are) into the project the way the other classes are and rebuild the project. There are instructions on the front page about building the project.

If you just want to get your example working:

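# in the 3.9.2 distribution directory
mkdir -p edu/stanford/nlp/examples
cp CustomLemmaAnnotator.java edu/stanford/nlp/examples
# -cp "*" puts the CoreNLP jars on the compile classpath (needed unless CLASSPATH is already set)
javac -cp "*" edu/stanford/nlp/examples/CustomLemmaAnnotator.java

J38 commented 5 years ago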

Here are some basic resources about Java:

https://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.html

https://users.soe.ucsc.edu/~eaugusti/archive/102-winter16/misc/howToCompileAndRunFromCommandLine.html

J38 commented 5 years ago

If you're going to be writing a fair amount of Java code for future projects, you might install ant so you can just integrate your code into the overall project.

https://ant.apache.org/manual/install.html

BvdM0 commented 5 years ago

Stupid question - where is the directory edu/stanford/nlp/examples? I’ve been working entirely in the location where I extracted the original download. Am I supposed to create it in there, or is it somewhere else?

J38 commented 5 years ago

You should be inside stanford-corenlp-full-2018-10-05, the top level directory.

edu/stanford/nlp/examples should be a sibling of stanford-corenlp-3.9.2.jar for example.
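
To show why the directory matters: Java looks for a class at a path derived from its fully qualified name, relative to each classpath entry. Here is a minimal sketch of that mapping; the class name ClassPathLayout and its two helper methods are made up for illustration and are not part of CoreNLP.

```java
import java.io.File;

class ClassPathLayout {
    /** Relative path of the compiled class file under a classpath root. */
    static String classFilePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    /** Directory (relative to the classpath root) that corresponds to the package. */
    static String packageDir(String fullyQualifiedName) {
        int lastDot = fullyQualifiedName.lastIndexOf('.');
        return lastDot < 0 ? "" : fullyQualifiedName.substring(0, lastDot).replace('.', '/');
    }

    public static void main(String[] args) {
        // The class named in the ClassNotFoundException above.
        String name = "edu.stanford.nlp.examples.CustomLemmaAnnotator";
        System.out.println("package directory:   " + packageDir(name));
        System.out.println("expected class file: " + classFilePath(name));

        // Checks relative to the directory this program is run from.
        File classFile = new File(classFilePath(name));
        System.out.println("found in current directory: " + classFile.isFile());
    }
}
```

One more thing to watch: a classpath entry of "*" only matches jar files, not loose .class files, so if the class still isn't found after compiling, adding the current directory as well (for example -cp "*:." on Linux or macOS) may be needed.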