languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[en] Unexpected testrules/Maven error with a case-sensitive AP in DT_VB_NN #2530

Open MikeUnwalla opened 4 years ago

MikeUnwalla commented 4 years ago

Noted while fixing #2521. I get an unexpected error when I add this antipattern (AP) to the disambiguation rule DT_VB_NN:

<antipattern case_sensitive="yes">
    <token>Paint</token>
    <token>it</token>
    <token>Black</token>
</antipattern>

I get this unexpected testrules error:

Exception in thread "main" java.lang.AssertionError: English: Did not expect error in:
  Those illustrated are reminiscent of a circus top.
  Analyzed token readings: [/SENT_START*] Those[those/DT*,B-NP-singular]  [ /null*] illustrated[illustrated/NN,E-NP-singular]  [ /null*] are[be/VBP,B-VP]  [ /null*] reminiscent[reminiscent/JJ,B-ADJP]  [ /null*] of[of/IN,B-PP]  [ /null*] a[a/DT,B-NP-singular]  [ /null*] circus[circus/JJ,I-NP-singular]  [ /null*] top[top/NN:UN,E-NP-singular] .[./.*,./SENT_END*,./PCT*,O]
Matching Rule: THIS_NNS[3] from /org/languagetool/rules/en/grammar.xml
        at org.junit.Assert.fail(Assert.java:88)
        at org.languagetool.rules.patterns.PatternRuleTest.testCorrectSentences(PatternRuleTest.java:533)
        at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:302)
        at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:157)
        at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:140)
        at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:638)
Running disambiguator rule tests...
Running disambiguation tests for English...
Exception in thread "main" org.junit.ComparisonFailure: The untouched example (These may be to bring about political change.) for English rule DT_VB_NN[1]:[/DT/exceptions=[/PDT|RB, th[eo]se|each|that|much/null], /VB[PZDN]?|MD, /exceptions=[/JJ, not/null]]:determiner + verb/NN -> NN] was touched! expected:<...P-singular] may[may/[MD,may/]NN:U,B-VP] be[be/VB,...> but was:<...P-singular] may[may/[]NN:U,B-VP] be[be/VB,...>
        at org.junit.Assert.assertEquals(Assert.java:115)
        at org.languagetool.tagging.disambiguation.rules.DisambiguationRuleTest.testDisambiguationRulesFromXML(DisambiguationRuleTest.java:129)
        at org.languagetool.tagging.disambiguation.rules.DisambiguationRuleTest.testDisambiguationRulesFromXML(DisambiguationRuleTest.java:81)
        at org.languagetool.tagging.disambiguation.rules.DisambiguationRuleTest.main(DisambiguationRuleTest.java:243)
Running XML bitext pattern tests...
Tests successful.
Validating false-friends.xml...
Validation successfully finished.

I also tried with LT 4.8 and got a similar result.

The unexpected error does not occur if I remove case_sensitive="yes" from the AP.

One other disambiguation rule (STATUS_QUO) uses <antipattern case_sensitive="yes">, and that rule works fine.
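
For readers unfamiliar with the attribute, here is a minimal sketch of what case_sensitive="yes" is expected to change for this AP. It is a toy model only, not LanguageTool's matching code, and the class and method names (AntipatternCaseDemo, containsSequence) are hypothetical. It also shows why the failure is surprising: the sentence from the failing disambiguation test does not contain "Paint it Black" in any casing, so the AP should not match it in either mode.

```java
import java.util.List;

// Toy illustration only; NOT LanguageTool code. All names here are hypothetical.
final class AntipatternCaseDemo {

    // Returns true if `pattern` occurs as a contiguous run of tokens inside `tokens`.
    static boolean containsSequence(List<String> tokens, List<String> pattern, boolean caseSensitive) {
        if (pattern.isEmpty()) {
            return true;
        }
        for (int start = 0; start + pattern.size() <= tokens.size(); start++) {
            boolean allMatch = true;
            for (int i = 0; i < pattern.size(); i++) {
                String token = tokens.get(start + i);
                String expected = pattern.get(i);
                boolean same = caseSensitive ? token.equals(expected) : token.equalsIgnoreCase(expected);
                if (!same) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> ap = List.of("Paint", "it", "Black");
        List<String> title = List.of("I", "like", "Paint", "it", "Black");
        List<String> lower = List.of("Please", "paint", "it", "black");
        // Sentence from the failing disambiguation test in this report.
        List<String> testSentence = List.of("These", "may", "be", "to", "bring", "about", "political", "change", ".");

        System.out.println(containsSequence(title, ap, true));         // true
        System.out.println(containsSequence(lower, ap, true));         // false
        System.out.println(containsSequence(lower, ap, false));        // true
        System.out.println(containsSequence(testSentence, ap, true));  // false
        System.out.println(containsSequence(testSentence, ap, false)); // false
    }
}
```

Under this model the attribute only narrows which sentences the AP can match, so adding it should not affect sentences that contain none of the AP's tokens.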

MikeUnwalla commented 4 years ago

Update: I put the AP into THAT_MD_MD and got this:

Tests finished!
Running disambiguator rule tests...
Running disambiguation tests for English...
Exception in thread "main" java.lang.AssertionError: Disambiguated sentence is equal to the input sentence for rule: THAT_MD_MD. The sentence was: <S> That[that/DT,that/RP,that/WDT,that/WP,B-NP-singular|E-NP-singular] might[might/MD,might/NN:U,B-VP] be[be/VB,I-VP] the[the/DT,B-NP-singular] most[most/NN:U,much/JJS,much/RBS,I-NP-singular] painful[painful/JJ,I-NP-singular] experience[experience/NN:U,experience/VB,experience/VBP,E-NP-singular] in[in/IN,B-PP] my[my/PRP$,B-NP-singular] life[life/NN:UN,E-NP-singular].[./.,</S>./PCT,O]
        at org.junit.Assert.fail(Assert.java:88)

danielnaber commented 4 years ago

Does this still happen? When I add the antipattern to DT_VB_NN, I don't get any error. Is there anything else I need to do to reproduce this?

MikeUnwalla commented 4 years ago

@danielnaber, yes, the error still occurs.

There is nothing else to do. Just add the AP.

For DT_VB_NN, I don't get an error message now.

For THAT_MD_MD, I get an error message. (Using snapshot 2020-03-09.)

danielnaber commented 4 years ago

Strange, I still cannot reproduce with current master. If it still occurs in a few days, please add the code (but comment it out), so that I don't miss anything when trying to reproduce.

MikeUnwalla commented 4 years ago

I synced a few minutes ago, put the AP in THAT_MD_MD in my GitHub repository, and got this Maven message:

[ERROR] Failures:
[ERROR]   EnglishTest.testLanguage:47->LanguageSpecificTest.runTests:40->LanguageSpecificTest.runTests:52 Disambiguated sentence is equal to the input sentence for rule: THAT_MD_MD. The sentence was: <S> That[that/DT,that/RP,that/WDT,that/WP,B-NP-singular|E-NP-singular] might[might/MD,might/NN:U,B-VP] be[be/VB,I-VP] the[the/DT,B-NP-singular] most[most/NN:U,much/JJS,much/RBS,I-NP-singular] painful[painful/JJ,I-NP-singular] experience[experience/NN:U,experience/VB,experience/VBP,E-NP-singular] in[in/IN,B-PP] my[my/PRP$,B-NP-singular] life[life/NN:UN,E-NP-singular].[./.,</S>./PCT,O]
[INFO]
[ERROR] Tests run: 92, Failures: 1, Errors: 0, Skipped: 6
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for languagetool-parent 4.9-SNAPSHOT:
[INFO]
[INFO] languagetool-parent ................................ SUCCESS [  1.736 s]
[INFO] LanguageTool Style and Grammar Checker Core ........ SUCCESS [ 52.417 s]
[INFO] English module for LanguageTool .................... FAILURE [01:26 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE

danielnaber commented 4 years ago

What command exactly do you run, testrules.bat en?

MikeUnwalla commented 4 years ago

I ran:

mvn --projects languagetool-language-modules/en --also-make clean test

The Maven result is in THAT_MD_MD_with_AP.txt