antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

Cpp runtime BailErrorStrategy does not stop parsing of wrong grammar. #1887

Open diatech opened 7 years ago

diatech commented 7 years ago

I'm using the same sample as in readme.txt:

int main(int argc, const char* argv[]) {
  std::ifstream stream;
  stream.open(argv[1]);
  ANTLRInputStream input(stream);
  MyGrammarLexer lexer(&input);
  CommonTokenStream tokens(&lexer);
  MyGrammarParser parser(&tokens);

  // Adding BailErrorStrategy and hoping to stop the parsing
  parser.setErrorHandler(make_shared<BailErrorStrategy>());

  Ref<tree::ParseTree> tree = parser.key();
  Ref<TreeShapeListener> listener(new TreeShapeListener());
  tree::ParseTreeWalker::DEFAULT.walk(listener, tree);

  return 0;
}

I feed it input that does not match the grammar, but the parser does not "bail". Am I using the handler correctly? I hope to see more documentation for the Cpp runtime.

Thanks!

mike-lischke commented 7 years ago

How do you know it doesn't bail out? The bail-out error strategy throws a specific exception that is not caught by the usual recognition error logic, which is how it stops the recognition process without attempting to recover. Usually you parse in SLL prediction mode with the bail-out strategy first, catch the thrown cancellation error, and then restart the parsing process in full LL prediction mode. See this code example (unfortunately written in TypeScript; I have no C++ example online).
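The control flow described above can be sketched without the ANTLR runtime. This is only an illustration under stated assumptions: `ParseCancelled`, `Mode`, `ParseFn`, `twoStageParse`, and `makeToyParser` are hypothetical stand-ins, not ANTLR API. In the C++ runtime the exception thrown by `BailErrorStrategy` is `ParseCancellationException`; the toy "parser" here only checks balanced parentheses.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the cancellation exception a bail-out strategy throws.
struct ParseCancelled : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-ins for the two prediction modes.
enum class Mode { SLL, LL };

// One parse attempt: returns collected error messages, or throws
// ParseCancelled on the first error when `bail` is true.
using ParseFn = std::function<std::vector<std::string>(Mode, bool)>;

struct Outcome {
    Mode modeUsed;
    std::vector<std::string> errors;
};

// Stage 1: fast mode, bail on the first error.
// Stage 2: only if stage 1 was cancelled, re-parse in full mode with
// normal error reporting.
Outcome twoStageParse(const ParseFn& parseOnce) {
    try {
        parseOnce(Mode::SLL, /*bail=*/true);
        return Outcome{Mode::SLL, {}};
    } catch (const ParseCancelled&) {
        return Outcome{Mode::LL, parseOnce(Mode::LL, /*bail=*/false)};
    }
}

// Toy parser for illustration only: accepts balanced parentheses.
ParseFn makeToyParser(std::string input) {
    return [input](Mode, bool bail) {
        std::vector<std::string> errors;
        int depth = 0;
        for (std::size_t i = 0; i < input.size(); ++i) {
            if (input[i] == '(') ++depth;
            else if (input[i] == ')') --depth;
            if (depth < 0) {
                if (bail) throw ParseCancelled("bail at column " + std::to_string(i));
                errors.push_back("unexpected ')' at column " + std::to_string(i));
                depth = 0;  // crude recovery so the scan can continue
            }
        }
        if (depth > 0) {
            if (bail) throw ParseCancelled("bail at end of input");
            errors.push_back("missing ')' at end of input");
        }
        return errors;
    };
}
```

The point of the sketch is that the cancellation exception has to be caught by the caller; if nothing catches it, it propagates out of the parse call rather than being reported through the error listeners.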

diatech commented 7 years ago

Thanks Mike,

My solution ended up being a custom BaseErrorListener that throws a std exception instead. Something like:

// G4 error listener that bails on the first syntax error
// (needs <sstream> and <stdexcept>)
class MyParserErrorListener: public BaseErrorListener
{
public:
    virtual void syntaxError(Recognizer *recognizer, Token * offendingSymbol, size_t line, size_t charPositionInLine,
                             const std::string &msg, std::exception_ptr e) override
    {
        // std::ostringstream replaces the deprecated ostrstream, whose str()
        // returns an unterminated char* that the caller must manage
        std::ostringstream s;
        s << "Line(" << line << ":" << charPositionInLine << ") Error(" << msg << ")";
        throw std::invalid_argument(s.str());
    }
};

....
class MySpecParser
{
    std::shared_ptr<MySpecParserBuilder> builder;
public:
    int addSpecSrc(std::string inputPath);
};

int MySpecParser::addSpecSrc(string inputFn)
{
    std::ifstream stream;
    stream.open(inputFn.c_str());
    if (!stream.is_open())
    {
        cout << "Cannot read file(" << inputFn << ") " <<  std::endl;
        return -1;
    }

    // Lexical analysis of the input and create parser
    ANTLRInputStream input(stream);
    MyLexer lexer(&input);
    CommonTokenStream tokens(&lexer);
    MyParser parser(&tokens);

    // Adding error handler for parser
    parser.removeErrorListeners();
    MyParserErrorListener errorListener;
    parser.addErrorListener(&errorListener);

    try
    {
        // Create parse tree from tokens and walk it
        auto tree = parser.root();
        builder = std::make_shared<MySpecParserBuilder>();
        tree::ParseTreeWalker::DEFAULT.walk(builder.get(), tree);
    }
    catch (std::invalid_argument &e)
    {
        std::size_t found = inputFn.find_last_of("/\\");
        cout << "File(" << inputFn.substr(found+1) << ") " <<  e.what() << std::endl;
        return -1;
    }
    return 0;
}
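
The message formatting and file-name handling above can be pulled into two small helpers that are easy to check in isolation. This is just a sketch: `formatSyntaxError` and `baseName` are hypothetical names, not part of the ANTLR runtime, and they only mirror what the listener and the catch block already do.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Builds the same "Line(l:c) Error(msg)" text the listener throws.
std::string formatSyntaxError(std::size_t line, std::size_t column,
                              const std::string& msg) {
    std::ostringstream s;
    s << "Line(" << line << ":" << column << ") Error(" << msg << ")";
    return s.str();
}

// Returns the part of the path after the last '/' or '\\',
// or the whole path when it contains no separator.
std::string baseName(const std::string& path) {
    const std::size_t found = path.find_last_of("/\\");
    return found == std::string::npos ? path : path.substr(found + 1);
}
```

With these, the listener body reduces to `throw std::invalid_argument(formatSyntaxError(line, charPositionInLine, msg));` and the catch block can print `baseName(inputFn)`.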