antlr / antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.
http://antlr.org
BSD 3-Clause "New" or "Revised" License

ANTLRInputStream does not accept string with special character in antlr4 for cpp #2036

Open greatwall1995 opened 7 years ago

greatwall1995 commented 7 years ago

When I run code like this:

#include "antlr4-runtime.h"
#include "builderListener.h"

using namespace antlrcpptest;
using namespace antlr4;

int main(int , const char **) {
    ANTLRInputStream input("…");
    return 0;
}

Here … is a single byte with value 133 (0x85, which is outside the 7-bit ASCII range), not three dots. I get the following error:

terminate called after throwing an instance of 'std::range_error'
  what():  wstring_convert::from_bytes
Aborted (core dumped)

However, if I replace '…' with 'a', the code runs successfully.
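
A likely explanation, assuming the C++ runtime decodes the input string as UTF-8 (the `wstring_convert::from_bytes` in the message suggests it does): a lone byte 0x85 has the bit pattern of a UTF-8 continuation byte with no lead byte before it, so decoding fails, while 'a' is plain ASCII and decodes fine. The sketch below illustrates that difference with a hypothetical helper, `looks_like_utf8`, which is not part of the ANTLR runtime and only checks byte structure.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the ANTLR runtime): structural UTF-8 check.
// It verifies lead/continuation byte patterns only; it does not reject
// overlong encodings or surrogate code points.
bool looks_like_utf8(const std::string &s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;  // number of continuation bytes expected after c
        if (c < 0x80) {
            extra = 0;                       // ASCII
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1;                       // 110xxxxx
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2;                       // 1110xxxx
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3;                       // 11110xxx
        } else {
            return false;                    // stray continuation or invalid lead byte
        }
        if (extra > s.size() - i - 1) {
            return false;                    // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false;                // expected 10xxxxxx
            }
        }
        i += extra + 1;
    }
    return true;
}
```

With this check, `"a"` and the three-byte UTF-8 ellipsis `"\xE2\x80\xA6"` pass, while the single byte `"\x85"` does not.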

SeeSoftware commented 4 years ago

How has no one picked up on this in 2+ years? I'm experiencing this bug right now and it's annoying.

pfalson commented 4 years ago

This is blocking my migration from ANTLR 2. We use 0xFD as a delimiter, and escaping it would require a lot of rework.
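
One possible workaround, sketched under the assumption that the input is ISO-8859-1 (Latin-1) and that the runtime expects UTF-8: re-encode the bytes before constructing the stream. `latin1_to_utf8` is a hypothetical helper, not an ANTLR API. In Latin-1 every byte value equals its Unicode code point, so each byte at or above 0x80 becomes a two-byte UTF-8 sequence.

```cpp
#include <string>

// Hypothetical helper (not part of the ANTLR runtime): re-encode
// ISO-8859-1 bytes as UTF-8. Assumes the input really is Latin-1;
// Windows-1252 input would need a lookup table for 0x80..0x9F instead.
std::string latin1_to_utf8(const std::string &in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte expands to two
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));  // ASCII passes through
        } else {
            // Code points U+0080..U+00FF encode as 110000xx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

The converted string could then be passed to the stream, for example `ANTLRInputStream input(latin1_to_utf8(raw));`, where `raw` is a placeholder for the original bytes. The delimiter 0xFD would arrive as the code point U+00FD, so the grammar would have to match that character rather than a raw byte.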