antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License
17.12k stars 3.28k forks source link

[c++ runtime] reportNoViableAlternative() causing BAD_ACCESS #2550

Open pnck opened 5 years ago

pnck commented 5 years ago

CrashGrammar.g4:

grammar CrashGrammar;

// combined from https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4  parse + sql_stmt_list
parse: (';'* stmt ( ';'+ stmt )* ';'*)* EOF 
;

stmt: 'A' 'B'?
;

WS: [\u000B\t\r\n ]+  -> skip
;

ANY: .
;

main.cpp:

#include <iostream>

#include "antlr4-runtime.h"
#include "CrashGrammarLexer.h"
#include "CrashGrammarParser.h"

using namespace antlr_sdl;

int main(int argc, char **argv) {
    std::stringstream ss(R"__(
        A B ;
        A _ ;
    )__");
    antlr4::ANTLRInputStream antlr_input(ss);
    CrashGrammarLexer lexer(&antlr_input);
    antlr4::CommonTokenStream token_stream(&lexer);
    CrashGrammarParser parser(&token_stream);
    antlr4::tree::ParseTree *ast = parser.parse();
    std::cout << ast->toStringTree(&parser) << std::endl;
    return 0;
}

Crashes on parser doing parse(). Backtrace shows the NoViableAltException destructing on reportNoViableAlternative return, which is attempting to delete _deadEndConfigs that has been deleted

pnck commented 5 years ago

i also found that if grammar contains ambiguity, the generated parser will not break out of current parsing loop after handled the error, and will try accessing invalid objects which may had been released in a former exception handler.

is it by design?

chund commented 5 years ago

this probably relates to the change in PR #2398 and also to my comments there: https://github.com/antlr/antlr4/commit/95fecce8edfeedf4444e107eaa4f173054a436b9#commitcomment-31893994

But I were not sure how to fix.

chrisaycock commented 5 years ago

I was able to get your code sample working. I commented on your diff; if I remove the contents of the destructor, then your sample works. Just let the shared_ptr handle everything itself.

mike-lischke commented 5 years ago

A shared_ptr won't solve the issue, because the stored dead end configuration can be one of states that are managed in the DFA (also via a shared_ptr), while others might just be created in this handler code. That's what makes this such a tricky problem. Maybe defining a move constructor, which resets the _deleteConfigs in the source exception object would help?

chrisaycock commented 5 years ago

@mike-lischke Thanks for taking a look at this. A move constructor may be the answer.

chrisaycock commented 5 years ago

By the way, I just tried adding a move constructor:

NoViableAltException(NoViableAltException&&) = default;

The ANTLR runtime failed to compile because std::make_exception_ptr() is specified to always make a copy. (See 17.9.6 of the WD standard).

The move constructor was a good idea, but it looks like it isn't allowed.

pnck commented 5 years ago

yes, i closed that PR also for another reason that i found crash may happen even let shared_ptr to handle the pointer. if the grammar definition includes some ambiguities and the parsing token is leading to multi errors, some objects would be use after freed because of the parse loop not breaking out. but i failed to simplify that example and it's related to my personal job so i didn't post it. i may try again to see if i can make another example.

pnck commented 5 years ago

here is the example, please check it

parse: stmt_list* EOF
;

stmt_list: ';'* stmt ( ';'+ stmt )* ';'*
;

stmt:  'A' 'B' |  'A' ('B')?;

WS: [\n ]+  -> skip
;

ANY: .
;

with input

A;
A _B;

run with valgrind and it will warn Invalid Write

pnck commented 5 years ago

please take a review at #2562

i think the real reason is deleting a raw ATNConfigSet pointer which was originally managed by an unique_ptr in DFAState

michalbali256 commented 5 years ago

2501