languagetool-org / languagetool

Style and Grammar Checker for 25+ Languages
https://languagetool.org
GNU Lesser General Public License v2.1

[en] StackOverflowError if an exception has thousands of words #6237

Open MikeUnwalla opened 2 years ago

MikeUnwalla commented 2 years ago

This [edited] rule causes a stack overflow:

    <rule id="STACK_OVERFLOW_TEST" name="Stack overflow">
      <pattern>
        <token regexp="yes">\w\w+
          <exception regexp="yes" inflected="yes" case_sensitive="yes">aardvark|aback|abacus|abalone|abandon... rent|rental|renumber|renunciation|reopen</exception>
        </token>
      </pattern>
      <message>Found it.</message>
      <example type="incorrect"><marker>FindMePlease</marker></example>
      <example type="correct"><marker>aardvark</marker></example>
    </rule>

The stack overflow occurs when there are approximately 13,000 words in the exception:

    D:\LanguageTool-5.7-SNAPSHOT>testrules en
    Running XML pattern tests...
    LanguageTool version 5.7-SNAPSHOT (2022-01-10 19:41:10 +0000, 46c2d6c)
    Known languages: [Arabic, English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Dutch (Belgium), Portuguese, Portuguese (Portugal), Portuguese (Brazil), Portuguese (Angola preAO), Portuguese (Moçambique preAO), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Irish, Galician, Greek, Japanese, Khmer, Romanian, Slovak, Slovenian, Spanish, Spanish (voseo), Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
    Running XML validation for en/grammar.xml...
    Running pattern rule tests for English (org.languagetool.language.English)...
    Exception in thread "main" java.lang.StackOverflowError
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
            at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Unknown Source)
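
The repeated `ConcatSpliterator.forEachRemaining` frames are what a deeply nested chain of `Stream.concat` calls looks like when it is consumed; the JDK documentation for `Stream.concat` warns that repeated concatenation can lead to deep call chains or a stack overflow. The trace does not show which LanguageTool code builds the chain, so the sketch below is only a generic illustration of the mechanism and is not taken from the LanguageTool source. The class and method names (`ConcatDepthSketch`, `nestedConcat`, `flat`, `overflowsAt`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration; not LanguageTool code.
class ConcatDepthSketch {

    // Concatenates one single-element stream per word. Each Stream.concat
    // call wraps the previous result, so the nesting depth of the
    // underlying spliterators grows with the number of words.
    static Stream<String> nestedConcat(List<String> words) {
        Stream<String> result = Stream.empty();
        for (String word : words) {
            result = Stream.concat(result, Stream.of(word));
        }
        return result;
    }

    // Flat alternative: flatMap walks the parts one after another
    // without nesting one stream inside another.
    static Stream<String> flat(List<List<String>> parts) {
        return parts.stream().flatMap(List::stream);
    }

    // Returns true if building or consuming a nested chain of the given
    // depth overflows the stack. The result depends on the JVM stack size.
    static boolean overflowsAt(int depth) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            words.add("w" + i);
        }
        try {
            nestedConcat(words).forEach(w -> { });
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("aardvark", "aback", "abacus");
        // Both approaches yield the same elements for a small input.
        nestedConcat(words).forEach(System.out::println);
        flat(Arrays.asList(words.subList(0, 1), words.subList(1, 3)))
            .forEach(System.out::println);
    }
}
```

For small inputs both methods behave the same. Calling `overflowsAt` with a depth in the tens of thousands may return `true` on a default stack, which would be consistent with the roughly 13,000-word threshold reported here, but the exact depth depends on the JVM and its `-Xss` setting.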

My workaround is to split the exception into a set of smaller exceptions, but I am filing this issue for the record.
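
A minimal sketch of that workaround, assuming the word list is available as plain strings with no regex or XML special characters: it splits the list into fixed-size chunks and emits one `<exception>` element per chunk, using the same attributes as the rule above. The class name `ExceptionSplitter`, the method `toExceptionElements`, and the chunk size are hypothetical; the issue does not say which chunk size is safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of LanguageTool.
class ExceptionSplitter {

    // Builds one <exception> element per chunk of at most chunkSize words.
    // Assumes the words need no regex or XML escaping.
    static List<String> toExceptionElements(List<String> words, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> elements = new ArrayList<>();
        for (int start = 0; start < words.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, words.size());
            // Join the words of this chunk into a regex alternation.
            String alternation = String.join("|", words.subList(start, end));
            elements.add("<exception regexp=\"yes\" inflected=\"yes\" case_sensitive=\"yes\">"
                + alternation + "</exception>");
        }
        return elements;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("aardvark", "aback", "abacus", "abalone", "abandon");
        // A chunk size of 2 is used only to keep the demo output short.
        toExceptionElements(words, 2).forEach(System.out::println);
    }
}
```

The generated elements would then be pasted inside the `<token>` element in place of the single large `<exception>`.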

abimael-turing commented 5 months ago

@MikeUnwalla Can I work on this issue? Could you give more details, such as a demo or the steps to reproduce this error?

MikeUnwalla commented 5 months ago

@abimael-turing, all the information is in the example rule and my comment.

To work on the issue, I think that you can assign yourself.