hyrise / sql-parser

SQL Parser for C++. Building C++ object structure from SQL statements.
MIT License

UTF-8 support #36

Open occash opened 7 years ago

occash commented 7 years ago

It is currently not possible to pass non-ASCII identifiers. It would be great to support UTF-8 queries.

Flex can process UTF-8 characters with a regex like this:

%option 8bit

([A-Za-z0-9_]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf][\x80-\xbf]|[\xf0-\xf4][\x80-\xbf][\x80-\xbf][\x80-\xbf])+ {
    /* One or more ASCII identifier characters or 2- to 4-byte UTF-8 sequences. */
    yylval->sval = strdup(yytext);
    return SQL_IDENTIFIER;
}
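
To make the byte ranges in the rule above easier to check outside the lexer, here is a minimal C++ sketch that applies the same ranges by hand. The function name `matchesUtf8Identifier` is hypothetical and not part of sql-parser. It rejects the empty string, on the assumption that an empty identifier is never wanted. Like the pattern, it only checks lead and continuation byte ranges, so it is not a full UTF-8 validator (some overlong and surrogate encodings still pass).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of sql-parser): returns true if `s` is a
// non-empty sequence of the units accepted by the Flex pattern above.
bool matchesUtf8Identifier(const std::string& s) {
    auto inRange = [](unsigned char c, unsigned char lo, unsigned char hi) {
        return c >= lo && c <= hi;
    };
    // True if the byte at index i exists and is a continuation byte (0x80-0xbf).
    auto isCont = [&](std::size_t i) {
        return i < s.size() &&
               inRange(static_cast<unsigned char>(s[i]), 0x80, 0xbf);
    };

    if (s.empty()) return false;

    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '_') {
            i += 1;  // [A-Za-z0-9_]
        } else if (inRange(c, 0xc2, 0xdf) && isCont(i + 1)) {
            i += 2;  // [\xc2-\xdf][\x80-\xbf]
        } else if (inRange(c, 0xe0, 0xef) && isCont(i + 1) && isCont(i + 2)) {
            i += 3;  // [\xe0-\xef][\x80-\xbf][\x80-\xbf]
        } else if (inRange(c, 0xf0, 0xf4) && isCont(i + 1) && isCont(i + 2) &&
                   isCont(i + 3)) {
            i += 4;  // [\xf0-\xf4][\x80-\xbf][\x80-\xbf][\x80-\xbf]
        } else {
            return false;  // byte outside every accepted range
        }
    }
    return true;
}
```

For example, `matchesUtf8Identifier("caf\xc3\xa9")` accepts the two-byte encoding of "é", while a lone `"\xc3"` is rejected because its continuation byte is missing.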

torpedro commented 7 years ago

Good observation! Thanks! Ultimately we should add UTF-8 support, but right now we are focusing on other efforts. If you already have something working, feel free to open a PR and we can work on integrating it into master ASAP.

Thanks!