google-code-export / dkpro-core-asl

Automatically exported from code.google.com/p/dkpro-core-asl

Problem with ClearNLP SRL #553

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. I used ClearNLP within DKPro Core 1.7; the model variant is OntoNotes.
2. I ran a pipeline with ClearNlpSegmenter, ClearNlpLemmatizer, 
ClearNlpDependencyParser, and ClearNlpSemanticRoleLabeler.
3. I got an exception when processing this sentence: "The Wasilla Bible 
Church in Wasilla, Alaska was apparently attacked by an arsonist last night." 
If I disable the SRL component, the pipeline works.

What is the expected output? What do you see instead?
...
Caused by: java.lang.NullPointerException
    at com.clearnlp.dependency.DEPLibEn.getTopVerbChain(DEPLibEn.java:373)
    at com.clearnlp.dependency.DEPLibEn.labelReferentOfRelativeClause(DEPLibEn.java:354)
    at com.clearnlp.dependency.DEPLibEn.postLabel(DEPLibEn.java:300)
    at com.clearnlp.component.srl.EnglishSRLabeler.postLabel(EnglishSRLabeler.java:149)
    at com.clearnlp.component.srl.AbstractSRLabeler.label(AbstractSRLabeler.java:335)
    at com.clearnlp.component.srl.AbstractSRLabeler.processAux(AbstractSRLabeler.java:255)
    at com.clearnlp.component.srl.AbstractSRLabeler.process(AbstractSRLabeler.java:230)
    at de.tudarmstadt.ukp.dkpro.core.clearnlp.ClearNlpSemanticRoleLabeler.process(ClearNlpSemanticRoleLabeler.java:315)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:385)
    ... 32 more

What version of the product are you using? On what operating system?
dkpro 1.7, MacOS 10.10.1

Please provide any additional information below.

Original issue reported on code.google.com by nghiemt...@gmail.com on 4 Dec 2014 at 3:52

GoogleCodeExporter commented 9 years ago
You have to add a POS-tagger to your pipeline for the SRL to work. The input 
capabilities of ClearNlpSemanticRoleLabeler declare:

  "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Sentence",
  "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token",
  "de.tudarmstadt.ukp.dkpro.core.api.lexmorph.type.pos.POS",
  "de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Lemma",
  "de.tudarmstadt.ukp.dkpro.core.api.syntax.type.dependency.Dependency"

The following test case works fine (paste into ClearNlpSemanticRoleLabelerTest 
and run with -Xmx3300m):

    @Test
    public void testEnglish2()
        throws Exception
    {
        Assume.assumeTrue(Runtime.getRuntime().maxMemory() > 3000000000L);

        JCas jcas = runTest("en", null,
                "The Wasilla Bible Church in Wasilla, Alaska was apparently attacked by an "
                + "arsonist last night.");

        String[] predicates = new String[] {
                "attacked (attack.01): [(A1:Church)(AM-ADV:apparently)(A0:by)(AM-TMP:night.)]" };

        AssertAnnotations.assertSemanticPredicates(predicates,
                select(jcas, SemanticPredicate.class));
    }

Original comment by richard.eckart on 4 Dec 2014 at 4:31

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 4 Dec 2014 at 4:31

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Hi Richard,

Actually, I forgot to mention the POS tagger (without it, the parser could not 
run either).

I tried to follow the approach used in the test class, but it still fails 
with the same exception. Here is my pipeline:

AnalysisEngineDescription engine = createAggregateDescription(
                createPrimitiveDescription(StanfordSegmenter.class),
                createPrimitiveDescription(ClearNlpPosTagger.class),
                createPrimitiveDescription(ClearNlpLemmatizer.class),
                createPrimitiveDescription(ClearNlpDependencyParser.class),
                createPrimitiveDescription(ClearNlpSemanticRoleLabeler.class,
                                ClearNlpDependencyParser.PARAM_VARIANT, null,
                                ClearNlpDependencyParser.PARAM_PRINT_TAGSET, true));

Original comment by nghiemt...@gmail.com on 4 Dec 2014 at 8:52

GoogleCodeExporter commented 9 years ago
I forgot to mention that the pipeline actually ran successfully on many documents 
in my corpus, but this sentence (the one I posted above) is one of the cases 
that cause the exception.

Original comment by nghiemt...@gmail.com on 4 Dec 2014 at 8:58

GoogleCodeExporter commented 9 years ago
I also cannot reproduce the problem.

The code snippet in comment #4 uses StanfordSegmenter.class instead of 
ClearNlpSegmenter. That should probably not make much of a difference, but it 
might be worth a try.

It also passes the parser parameters into the SRL component, which might not be 
what you want.

Finally, it uses deprecated uimaFIT methods. If they are not marked as such for 
you, you might not be using the most recent version.

Original comment by torsten....@gmail.com on 4 Dec 2014 at 9:07

GoogleCodeExporter commented 9 years ago
I can reproduce the problem with uimaFIT 2.1.0 and DKPro Core 1.7.0.

The problem is the dependency tree:

1   The the DT  DT  _   4   det _   _
2   Wasilla wasilla NNP NNP _   4   nn  _   _
3   Bible   bible   NNP NNP _   4   nn  _   _
4   Church  church  NNP NNP _   _   _   _   _
5   in  in  IN  IN  _   4   prep    _   _
6   Wasilla wasilla NNP NNP _   5   pobj    _   _
7   ,   ,   ,   ,   _   11  punct   _   _
8   Alaska  alaska  NNP NNP _   11  nsubjpass   _   _
9   was be  VBD VBD _   11  auxpass _   _
10  apparently  apparently  RB  RB  _   11  advmod  _   _
11  attacked    attack  VBN VBN _   _   _   _   _
12  by  by  IN  IN  _   11  agent   _   _
13  an  an  DT  DT  _   14  det _   _
14  arsonist    arsonist    NN  NN  _   12  pobj    _   _
15  last    last    JJ  JJ  _   16  amod    _   _
16  night   night   NN  NN  _   11  npadvmod    _   _
17  .   .   .   .   _   11  punct   _   _

The word "attacked" does not have a head in this tree. However, the SRL code 
requires that verbs have a head:

    /** Called by {@link DEPLibEn#labelReferentOfRelativeClause(DEPNode, List)}. */
    static private DEPNode getTopVerbChain(DEPNode verb)
    {
        while (MPLibEn.isVerb(verb.getHead().pos) && (verb.isLabel(DEP_CONJ) || verb.isLabel(DEP_XCOMP)))
            verb = verb.getHead();

        return verb;
    }

The NPE is caused by the SRL code trying to access the pos of the head of 
"attacked".
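
The failure mode can be illustrated in isolation. The following is a minimal sketch, not ClearNLP code: `DemoNode` and `topVerbChain` are hypothetical stand-ins for `DEPNode` and `getTopVerbChain`, and the verb check is simplified to "the POS tag starts with VB". It only shows that reading the POS of a missing head throws, and that the same walk succeeds once the verb is attached to some root node.

```java
// Hypothetical, simplified stand-in for ClearNLP's DEPNode (assumption, not the real class).
class DemoNode {
    final String form;
    final String pos;
    DemoNode head;   // null means no head was ever attached
    String label;    // dependency label, e.g. "conj", "xcomp", "root"

    DemoNode(String form, String pos) {
        this.form = form;
        this.pos = pos;
    }

    void setHead(DemoNode head, String label) {
        this.head = head;
        this.label = label;
    }
}

class HeadlessVerbDemo {
    // Simplified verb test: Penn Treebank verb tags start with "VB".
    static boolean isVerb(String pos) {
        return pos != null && pos.startsWith("VB");
    }

    // Same shape as the loop quoted above: the head's POS is read unconditionally,
    // so a verb whose head is null triggers a NullPointerException.
    static DemoNode topVerbChain(DemoNode verb) {
        while (isVerb(verb.head.pos)
                && ("conj".equals(verb.label) || "xcomp".equals(verb.label))) {
            verb = verb.head;
        }
        return verb;
    }

    public static void main(String[] args) {
        DemoNode attacked = new DemoNode("attacked", "VBN");
        try {
            topVerbChain(attacked);
        } catch (NullPointerException e) {
            System.out.println("headless verb: NullPointerException");
        }

        // Once the verb hangs off a (virtual) root, the walk terminates normally.
        attacked.setHead(new DemoNode("<root>", "ROOT"), "root");
        System.out.println("attached verb: " + topVerbChain(attacked).form);
    }
}
```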

Original comment by richard.eckart on 4 Dec 2014 at 9:14

GoogleCodeExporter commented 9 years ago
When running ClearNLP standalone (no UIMA), this is the output:

1   The the DT  _   4   det _
2   Wasilla wasilla NNP _   4   nn  _
3   Bible   bible   NNP _   4   nn  _
4   Church  church  NNP _   0   root    _
5   in  in  IN  _   4   prep    _
6   Wasilla wasilla NNP _   5   pobj    _
7   ,   ,   ,   _   11  punct   _
8   Alaska  alaska  NNP _   11  nsubjpass   11:A1=PPT
9   was be  VBD _   11  auxpass _
10  apparently  apparently  RB  _   11  advmod  11:AM-ADV
11  attacked    attack  VBN pb=attack.01    0   root    _
12  by  by  IN  _   11  agent   11:A0=PAG
13  an  an  DT  _   14  det _
14  arsonist    arsonist    NN  p2=JJ   12  pobj    _
15  last    last    JJ  p2=NN   16  amod    _
16  night   night   NN  _   11  npadvmod    11:AM-TMP
17  .   .   .   _   11  punct   _

Original comment by richard.eckart on 4 Dec 2014 at 9:31

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
The problem is in the ClearNlpSemanticRoleLabeler code. The ClearNLP code 
expects that all root nodes in a dependency tree are attached to a virtual root 
node. But the ClearNlpSemanticRoleLabeler code only attaches the first root node 
to that virtual root node - the others remain without a head. This causes the 
NPE. I'm about to fix this.
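
The difference between the two behaviours can be sketched independently of DKPro Core and ClearNLP. In the minimal sketch below, `Tok`, `attachFirstRootOnly`, `attachAllRoots`, and the reduced four-token sample are hypothetical names and data chosen for illustration; the actual change in ClearNlpSemanticRoleLabeler may well look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical token type for illustration; not a DKPro Core or ClearNLP class.
class Tok {
    final String form;
    Tok head; // null until a head is assigned

    Tok(String form) {
        this.form = form;
    }
}

class VirtualRootDemo {
    // Behaviour described as the bug: only the first headless token
    // is attached to the virtual root.
    static void attachFirstRootOnly(List<Tok> tokens, Tok virtualRoot) {
        for (Tok t : tokens) {
            if (t.head == null) {
                t.head = virtualRoot;
                return;
            }
        }
    }

    // Behaviour described as the fix: every headless token
    // is attached to the virtual root.
    static void attachAllRoots(List<Tok> tokens, Tok virtualRoot) {
        for (Tok t : tokens) {
            if (t.head == null) {
                t.head = virtualRoot;
            }
        }
    }

    // Returns the forms of all tokens that still have no head.
    static List<String> headless(List<Tok> tokens) {
        List<String> out = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.head == null) {
                out.add(t.form);
            }
        }
        return out;
    }

    // Reduced version of the tree from this issue: two roots, "Church" and "attacked".
    static List<Tok> sample() {
        Tok church = new Tok("Church");
        Tok in = new Tok("in");
        in.head = church;
        Tok attacked = new Tok("attacked");
        Tok by = new Tok("by");
        by.head = attacked;
        return Arrays.asList(church, in, attacked, by);
    }

    public static void main(String[] args) {
        List<Tok> a = sample();
        attachFirstRootOnly(a, new Tok("<root>"));
        System.out.println("first root only, still headless: " + headless(a));

        List<Tok> b = sample();
        attachAllRoots(b, new Tok("<root>"));
        System.out.println("all roots, still headless: " + headless(b));
    }
}
```

With the first strategy "attacked" is left without a head, which is exactly the node the SRL code later dereferences; with the second strategy no token remains headless.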

Original comment by richard.eckart on 4 Dec 2014 at 9:43

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r3220.

- Attach all root nodes to the virtual root, not only the first one.

Original comment by richard.eckart on 4 Dec 2014 at 9:48

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r3220.

Original comment by richard.eckart on 4 Dec 2014 at 9:48

GoogleCodeExporter commented 9 years ago
This issue was updated by revision r3221.

- Attach all root nodes to the virtual root, not only the first one.

Original comment by richard.eckart on 4 Dec 2014 at 9:52

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r3221.

Original comment by richard.eckart on 4 Dec 2014 at 9:52

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 4 Dec 2014 at 9:53

GoogleCodeExporter commented 9 years ago
@nghiemtrid: I suppose you still remember how to use our snapshots? ;)

Let me also point out that TZ is absolutely right: you should switch to uimaFIT 
2.1.0 - uimaFIT 1.4.0 won't be very good for DKPro Core 1.7.0.

Original comment by richard.eckart on 4 Dec 2014 at 9:55

GoogleCodeExporter commented 9 years ago
:-) Thanks a lot. 

I copied that code from the test class as a last effort to see whether it 
helps, since none of my code using the alternative methods worked.

Original comment by nghiemt...@gmail.com on 4 Dec 2014 at 10:06

GoogleCodeExporter commented 9 years ago

Original comment by richard.eckart on 19 Jan 2015 at 10:52