hyrise / sql-parser

SQL Parser for C++. Building C++ object structure from SQL statements.
MIT License

Segmentation fault when retrieving a SELECT DISTINCT TOP statement #226

Open startompiz12 opened 1 year ago

startompiz12 commented 1 year ago

Running the following code results in a segfault:

std::string query = "SELECT DISTINCT TOP 3 value, id FROM TableA";
hsql::SQLParserResult result;
hsql::SQLParser::parse(query, &result);
const hsql::SQLStatement* stmt = result.getStatement(0);

The DISTINCT and TOP clauses are each handled correctly on their own; the problem seems to occur only when both are used together. It also does not occur when a LIMIT clause is used instead of TOP.

mweisgut commented 1 year ago

Thank you for reporting it. We will look into it.

dey4ss commented 1 year ago

result.getStatement(0) returns a nullptr because the parser fails to parse the query. According to the parser rule, the TOP n clause has to come before the DISTINCT keyword: https://github.com/hyrise/sql-parser/blob/18b9d0877d15b2ba06dfea2946d8c96e37254525/src/parser/bison_parser.y#L823

Thus, the parser reports an error: syntax error, unexpected TOP (L0:16). When you swap the keywords and write

SELECT TOP 3 DISTINCT value, id FROM TableA

the statement gets parsed correctly.
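
As a stopgap on the caller's side, the keyword order could be rewritten before the query is handed to the parser. The following is only a minimal sketch: moveTopBeforeDistinct is a hypothetical helper, not part of sql-parser, and it assumes the query starts with SELECT DISTINCT TOP followed by an integer literal. It does not touch subqueries, comments, or string literals.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of hyrise/sql-parser).
// Rewrites a leading "SELECT DISTINCT TOP <n>" into "SELECT TOP <n> DISTINCT",
// which is the order the grammar accepts. Any other query is returned unchanged.
std::string moveTopBeforeDistinct(const std::string& query) {
  // Group 1: SELECT (with optional leading whitespace)
  // Group 2: DISTINCT
  // Group 3: TOP followed by an integer literal
  static const std::regex pattern(
      R"(^(\s*SELECT)\s+(DISTINCT)\s+(TOP\s+\d+))",
      std::regex::ECMAScript | std::regex::icase);

  // Swap groups 2 and 3; rewrite only the first match.
  return std::regex_replace(query, pattern, "$1 $3 $2",
                            std::regex_constants::format_first_only);
}
```

The rewritten string can then be passed to hsql::SQLParser::parse in place of the original query. A regex rewrite like this is brittle by nature, so it is best treated as a temporary measure rather than a substitute for a grammar change.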

startompiz12 commented 1 year ago

Hi, thank you for the quick answer.

According to the Microsoft documentation, the TOP clause must be placed after the DISTINCT one (at least for SQL Server and Azure SQL Database). Using SELECT TOP 3 DISTINCT would only work with Azure Synapse Analytics and Parallel Data Warehouse.