coli-saar / am-parser

Modular implementation of an AM dependency parser in AllenNLP.
Apache License 2.0

Wordnet exceptions when calling EvaluateAMR #60

Closed: weissenh closed 5 years ago

weissenh commented 5 years ago

Several exceptions appeared when I ran EvaluateAMR.java with train.amconll as input. Two examples:

Writing empty MRP graph instead
Error in line 1026220 with id bolt-eng-DF-170-181104-8737186_0011.4
edu.mit.jwi.data.IHasLifecycle$ObjectClosedException
        at edu.mit.jwi.DataSourceDictionary.checkOpen(DataSourceDictionary.java:121)
        at edu.mit.jwi.DataSourceDictionary.getExceptionEntry(DataSourceDictionary.java:308)
        at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:404)
        at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:392)
        at edu.mit.jwi.morph.WordnetStemmer.findStems(WordnetStemmer.java:80)
        at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.WordnetEnumerator.findNounStem(WordnetEnumerator.java:478)
        at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.ConceptnetEnumerator.findNounStem(ConceptnetEnumerator.java:265)
        at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixLabel(Relabel.java:365)
        at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixGraph(Relabel.java:203)
        at de.saar.coli.amrtagging.mrp.amr.AMR.evaluate(AMR.java:121)
        at de.saar.coli.amrtagging.mrp.tools.EvaluateAMR.main(EvaluateAMR.java:88)
Writing empty MRP graph instead
Error in line 1026432 with id PROXY_AFP_ENG_20070620_0032.6
edu.mit.jwi.data.IHasLifecycle$ObjectClosedException
        at edu.mit.jwi.DataSourceDictionary.checkOpen(DataSourceDictionary.java:121)
        at edu.mit.jwi.DataSourceDictionary.getExceptionEntry(DataSourceDictionary.java:308)
        at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:404)
        at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:392)
        at edu.mit.jwi.morph.WordnetStemmer.findStems(WordnetStemmer.java:80)
        at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.WordnetEnumerator.findVerbStem(WordnetEnumerator.java:407)
        at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.ConceptnetEnumerator.findVerbStem(ConceptnetEnumerator.java:255)
        at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixLabel(Relabel.java:350)
        at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixGraph(Relabel.java:203)
        at de.saar.coli.amrtagging.mrp.amr.AMR.evaluate(AMR.java:121)
        at de.saar.coli.amrtagging.mrp.tools.EvaluateAMR.main(EvaluateAMR.java:88)
weissenh commented 5 years ago

This command produced 296 edu.mit.jwi.data.IHasLifecycle$ObjectClosedException exceptions for me:

java -Xmx10g -cp build/libs/am-tools-all.jar de.saar.coli.amrtagging.mrp.tools.EvaluateAMR --corpus ~/shadow/train.amconll --wn /proj/irtg/amrtagging/amr-dependency-july2019/amr-dependency/resources/wordnet  --conceptnet /proj/irtg/amrtagging/amr-dependency-july2019/amr-dependency/resources/conceptnet-assertions-5.7.0.csv.gz --lookup ~/shadow/lookup/ --out ~/shadow/train_relabeled.mrp

The input amconll file consists of 50036 sentences. The sentences for which the exceptions occurred were not consecutive in the input file. The first error messages:

Reading ConceptNet + Wordnet stemmer.
Reading ConceptNet from pickle in /proj/irtg/amrtagging/amr-dependency-july2019/amr-dependency/resources/conceptnet-assertions-5.7.0.csv.gz.pkl ...
Done: read pickle in 5.409s.
Error in line 1005572 with id bolt-eng-DF-170-181103-8889109_0015.21
edu.mit.jwi.data.IHasLifecycle$ObjectClosedException
    at edu.mit.jwi.DataSourceDictionary.checkOpen(DataSourceDictionary.java:121)
    at edu.mit.jwi.DataSourceDictionary.getExceptionEntry(DataSourceDictionary.java:308)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:404)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:392)
    at edu.mit.jwi.morph.WordnetStemmer.findStems(WordnetStemmer.java:80)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.WordnetEnumerator.findNounStem(WordnetEnumerator.java:478)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.ConceptnetEnumerator.findNounStem(ConceptnetEnumerator.java:265)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixLabel(Relabel.java:365)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixGraph(Relabel.java:203)
    at de.saar.coli.amrtagging.mrp.amr.AMR.evaluate(AMR.java:121)
    at de.saar.coli.amrtagging.mrp.tools.EvaluateAMR.main(EvaluateAMR.java:88)
Writing empty MRP graph instead

The last:

Error in line 1035520 with id nw.chtb_0134.2
edu.mit.jwi.data.IHasLifecycle$ObjectClosedException
    at edu.mit.jwi.DataSourceDictionary.checkOpen(DataSourceDictionary.java:121)
    at edu.mit.jwi.DataSourceDictionary.getExceptionEntry(DataSourceDictionary.java:308)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:404)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:392)
    at edu.mit.jwi.morph.WordnetStemmer.findStems(WordnetStemmer.java:80)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.WordnetEnumerator.findVerbStem(WordnetEnumerator.java:407)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.ConceptnetEnumerator.findVerbStem(ConceptnetEnumerator.java:255)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixLabel(Relabel.java:350)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixGraph(Relabel.java:203)
    at de.saar.coli.amrtagging.mrp.amr.AMR.evaluate(AMR.java:121)
    at de.saar.coli.amrtagging.mrp.tools.EvaluateAMR.main(EvaluateAMR.java:88)
Writing empty MRP graph instead
Error in line 1035658 with id bolt-eng-DF-201-185522-351953_0571.3
edu.mit.jwi.data.IHasLifecycle$ObjectClosedException
    at edu.mit.jwi.DataSourceDictionary.checkOpen(DataSourceDictionary.java:121)
    at edu.mit.jwi.DataSourceDictionary.getExceptionEntry(DataSourceDictionary.java:308)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:404)
    at edu.mit.jwi.RAMDictionary.getExceptionEntry(RAMDictionary.java:392)
    at edu.mit.jwi.morph.WordnetStemmer.findStems(WordnetStemmer.java:80)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.WordnetEnumerator.findNounStem(WordnetEnumerator.java:478)
    at de.saar.coli.amrtagging.formalisms.amr.tools.wordnet.ConceptnetEnumerator.findNounStem(ConceptnetEnumerator.java:265)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixLabel(Relabel.java:365)
    at de.saar.coli.amrtagging.formalisms.amr.tools.Relabel.fixGraph(Relabel.java:203)
    at de.saar.coli.amrtagging.mrp.amr.AMR.evaluate(AMR.java:121)
    at de.saar.coli.amrtagging.mrp.tools.EvaluateAMR.main(EvaluateAMR.java:88)
Writing empty MRP graph instead

I noticed that the resulting mrp file had very small graphs at the beginning (basically one node each) and bigger ones at the end of the file.

alexanderkoller commented 5 years ago

Given the rare and unsystematic occurrence of this exception, I think the best approach is to simply handle it more gracefully for now. I changed the calls to the Wordnet stemmer such that they return null if this exception occurs; because this already happens with other Wordnet exceptions, the places in the code that call the stem lookup already know how to handle this and keep going. So at least we'll only get dummy node labels for the words and the rest of the graph can be evaluated.
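
The handling described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the actual commit: `SafeStemLookup`, `firstStemOrNull`, and the stand-in `DictionaryClosedException` are hypothetical names, and the stemmer is modeled as a plain function so the example runs without JWI on the classpath. In the real code, the exception to catch would be the `edu.mit.jwi.data.IHasLifecycle$ObjectClosedException` seen in the traces, thrown from `WordnetStemmer.findStems`.

```java
import java.util.List;
import java.util.function.Function;

public class SafeStemLookup {

    /** Hypothetical stand-in for JWI's ObjectClosedException, so the sketch compiles without JWI. */
    public static class DictionaryClosedException extends RuntimeException {}

    /**
     * Looks up the stems of a word and returns the first one.
     * Returns null if the lookup throws or yields no stems, so that
     * callers which already handle null can keep going.
     */
    public static String firstStemOrNull(Function<String, List<String>> stemmer, String word) {
        try {
            List<String> stems = stemmer.apply(word);
            return (stems == null || stems.isEmpty()) ? null : stems.get(0);
        } catch (DictionaryClosedException e) {
            // Assumption: a closed dictionary is treated like any other failed lookup.
            return null;
        }
    }

    public static void main(String[] args) {
        // A stemmer that works normally.
        Function<String, List<String>> working = w -> List.of("dog");
        // A stemmer that simulates the closed-dictionary failure.
        Function<String, List<String>> closed = w -> {
            throw new DictionaryClosedException();
        };

        System.out.println(firstStemOrNull(working, "dogs"));
        System.out.println(firstStemOrNull(closed, "dogs"));
    }
}
```

Returning null instead of propagating the exception means a failed lookup only costs that one word its label, and the sentence is no longer replaced by an empty MRP graph.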

alexanderkoller commented 5 years ago

Fixed in https://github.com/coli-saar/am-tools/commit/3a17b59c1efe7236aa2b380292960ede514b0abc.