TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode NOT IN (
SELECT l_commitdate
FROM lineitem
) and l_shipmode is not null
and l_commitdate is not null
;
return 0 row
Rewritten sql
SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL
EXCEPT
SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IN (
SELECT l_commitdate
FROM lineitem
)AND l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL;
return 5905 row
Analysis
These two queries are logically equivalent, although they are written differently.
The original query filters rows using the NOT IN subquery, selecting rows where l_shipmode is not in the set of l_commitdate values from the lineitem table, and where both l_shipmode and l_commitdate are not NULL (using the IS NOT NULL condition). The result is all rows where l_shipmode is not in the l_commitdate set and both columns are not NULL.
In the rewritten query, the same logic is achieved using the EXCEPT operator. EXCEPT returns the results from the first query, but excludes rows that are present in the second query. Here, the first query returns all records where l_shipmode and l_commitdate are not NULL, and the second query excludes rows where l_shipmode is in the set of l_commitdate. The final result is the same as the original query.
The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.
BugReport: Issues with EXCEPT function
version
8.3.0
Original sql
return 0 row
Rewritten sql
return 5905 row
Analysis
These two queries are logically equivalent, although they are written differently.
The original query filters rows using the NOT IN subquery, selecting rows where l_shipmode is not in the set of l_commitdate values from the lineitem table, and where both l_shipmode and l_commitdate are not NULL (using the IS NOT NULL condition). The result is all rows where l_shipmode is not in the l_commitdate set and both columns are not NULL.
In the rewritten query, the same logic is achieved using the EXCEPT operator. EXCEPT returns the results from the first query, but excludes rows that are present in the second query. Here, the first query returns all records where l_shipmode and l_commitdate are not NULL, and the second query excludes rows where l_shipmode is in the set of l_commitdate. The final result is the same as the original query.
The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.
How to repeat
The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .