antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License
17.11k stars 3.28k forks source link

ParseContext#parse() throws StackOverflowError on tiny stack Android devices #1525

Closed gfx closed 7 years ago

gfx commented 7 years ago

Before submitting an issue to ANTLR, please check off these boxes:


This issue is alike to https://github.com/antlr/antlr4/issues/744, but with some extra information.

I have found this issue on my Android library project Android-Orma, which parses SQLite DDL with an ANTLR4-based SQL parser.

How can I avoid this issue?

new Thread(group, runnable, name, stackSize) does not affect this issue. In fact, stackSize seems to be ignored on Android (at least in older versions like Android 4.x).

Environment

Stacktrace

Copied from https://github.com/gfx/Android-Orma/issues/352#issuecomment-267801384

org.antlr.v4.runtime.misc.ParseCancellationException: SQL is too complex to parse: CREATE TABLE foo (title TEXT DEFAULT (''))
at com.github.gfx.android.orma.migration.sqliteparser.SQLiteParserUtils.parseIntoCreateTableStatement(SQLiteParserUtils.java:64)
at com.github.gfx.android.orma.migration.test.sqliteparser_test.SQLiteParserUtilsTest.testComplexTable(SQLiteParserUtilsTest.java:161)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:511)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
at android.support.test.internal.runner.TestExecutor.execute(TestExecutor.java:59)
at android.support.test.runner.AndroidJUnitRunner.onStart(AndroidJUnitRunner.java:262)
at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1584)
Caused by: java.lang.StackOverflowError
at java.util.Arrays.equals(Arrays.java:1582)
at org.antlr.v4.runtime.atn.ArrayPredictionContext.equals(ArrayPredictionContext.java:78)
at java.util.LinkedHashMap.get(LinkedHashMap.java:259)
at org.antlr.v4.runtime.misc.DoubleKeyMap.get(DoubleKeyMap.java:38)
at org.antlr.v4.runtime.atn.PredictionContext.mergeArrays(PredictionContext.java:363)
at org.antlr.v4.runtime.atn.PredictionContext.merge(PredictionContext.java:174)
at org.antlr.v4.runtime.atn.ATNConfigSet.add(ATNConfigSet.java:155)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1529)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closure_(ParserATNSimulator.java:1583)
at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513)
(snip)
ericvergnaud commented 7 years ago

HI, One could argue that the issue lies in Android, which ignores stackSize. But more broadly, there is no guarantee that ANTLR will operate under tight resource constraints. There has been some exploration to move from recursive code to loop based algorithm. This may affect the stack, but less likely the heap, which is essential to performance. And there is no ETA. In short, you may want to reconsider the objective of delivering a rather complex parser for tiny devices. Eric Envoyé de mon iPhone

Le 18 déc. 2016 à 12:03, FUJI Goro notifications@github.com a écrit :

Before submitting an issue to ANTLR, please check off these boxes:

I am not submitting a question on how to use ANTLR; instead, go to antlr4-discussion google group or ask at stackoverflow I have done a search of the existing issues to make sure I'm not sending in a duplicate This issue is alike to #744, but with some extra information.

I have found this issue on my Android library project Android-Orma, which parses SQLite DDL with an ANTLR4-based SQL parser.

How can I avoid this issue?

new Thread(group, runnable, name, stackSize) does not affect this issue. In fact, stackSize seems to be ignored on Android (at least in older versions like Android 4.x).

Environment

Android with a tiny stack size (older Android, or Android emulators with heapSize=16MB) The grammar: https://github.com/bkiers/sqlite-parser/blob/master/src/main/antlr4/nl/bigo/sqliteparser/SQLite.g4 The program to parse: CREATE TABLE foo (title TEXT DEFAULT ('')) Stacktrace

Copied from gfx/Android-Orma#352 (comment)

org.antlr.v4.runtime.misc.ParseCancellationException: SQL is too complex to parse: CREATE TABLE foo (title TEXT DEFAULT ('')) at com.github.gfx.android.orma.migration.sqliteparser.SQLiteParserUtils.parseIntoCreateTableStatement(SQLiteParserUtils.java:64) at com.github.gfx.android.orma.migration.test.sqliteparsertest.SQLiteParserUtilsTest.testComplexTable(SQLiteParserUtilsTest.java:161) at java.lang.reflect.Method.invokeNative(Native Method) at java.lang.reflect.Method.invoke(Method.java:511) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at org.junit.runner.JUnitCore.run(JUnitCore.java:115) at android.support.test.internal.runner.TestExecutor.execute(TestExecutor.java:59) at android.support.test.runner.AndroidJUnitRunner.onStart(AndroidJUnitRunner.java:262) at android.app.Instrumentation$InstrumentationThread.run(Instrumentation.java:1584) Caused by: java.lang.StackOverflowError at java.util.Arrays.equals(Arrays.java:1582) at org.antlr.v4.runtime.atn.ArrayPredictionContext.equals(ArrayPredictionContext.java:78) at java.util.LinkedHashMap.get(LinkedHashMap.java:259) at org.antlr.v4.runtime.misc.DoubleKeyMap.get(DoubleKeyMap.java:38) at org.antlr.v4.runtime.atn.PredictionContext.mergeArrays(PredictionContext.java:363) at org.antlr.v4.runtime.atn.PredictionContext.merge(PredictionContext.java:174) at org.antlr.v4.runtime.atn.ATNConfigSet.add(ATNConfigSet.java:155) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1529) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) at org.antlr.v4.runtime.atn.ParserATNSimulator.closure(ParserATNSimulator.java:1583) at org.antlr.v4.runtime.atn.ParserATNSimulator.closureCheckingStopState(ParserATNSimulator.java:1513) (snip) — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

mike-lischke commented 7 years ago

@ericvergnaud is this exploration something that can publicly be seen? I've been thinking for a while about changing the highly recursive ATN simulator closure code to a loop based one, as it can exhaust the stack even on a a full desktop machine with increased stack size (e.g. for complex expression rules with left recursion).

ericvergnaud commented 7 years ago

I believe someone submitted a PR a while ago, at least there was some discussion

Envoyé de mon iPhone

Le 20 déc. 2016 à 18:22, Mike Lischke notifications@github.com a écrit :

@ericvergnaud is this exploration something that can publicly be seen? I've been thinking for a while about changing the highly recursive ATN simulator closure code to a loop based one, as it can exhaust the stack even on a a full desktop machine with increased stack size (e.g. for complex expression rules with left recursion).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

mike-lischke commented 7 years ago

Hmm, I believe you mean the iterative walker, which meanwhile made it into the runtime.

ericvergnaud commented 7 years ago

No I was actually referring to the discussion in #744, which didn’t go very far.

Le 20 déc. 2016 à 18:54, Mike Lischke notifications@github.com a écrit :

Hmm, I believe you mean the iterative walker, which meanwhile made it into the runtime.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/antlr/antlr4/issues/1525#issuecomment-268214067, or mute the thread https://github.com/notifications/unsubscribe-auth/ADLYJLFSP_xgoM3CR-q8btmf9mAZ7cD7ks5rJ7QAgaJpZM4LQCzT.

parrt commented 7 years ago

Yeah, not sure we can do anything about wanting to parse a complex language on a tiny device.

gfx commented 7 years ago

Hmm, if CREATE TABLE foo (title TEXT DEFAULT ('')) is too complex to parse, I think no language can be processed with ANTLR4.

Anyway, now I got it. Thanks.

parrt commented 7 years ago

@gfx the recursive overflow is actually a function of the deeply nested SQL grammar not the input. A few tokens could easily push recursion into the hundreds of invocations. :)

mike-lischke commented 7 years ago

@parrt Whenever I've seen a stack overflow due to deep recursion it was during ATN execution, never in the parser itself (have frequently seen stack depths of 1000+). The closure code does a roundtrip per ATN state (not per rule as the parser), so it is much more sensitive to the parsed language. By converting this to a loop we will probably get a much better behavior.

parrt commented 7 years ago

Yes, but the ATN is a version of the grammar 1-to-1. Analysis during parsing must traverse the ATN (grammar) to answer prediction questions. The ALL(*) is extremely complex particularly when you consider optimizations. Converting such a recursion process to a loop would be nigh on impossible.