pingcap / docs

TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://docs.pingcap.com
Other
588 stars 683 forks source link

BugReport: Issues with COALESCE function #19478

Open LLuopeiqi opened 1 day ago

LLuopeiqi commented 1 day ago

BugReport: Issues with COALESCE function

version

8.3.0

Original sql

SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode NOT IN (
    SELECT l_commitdate
    FROM lineitem
    WHERE l_commitdate IS NOT NULL
) AND l_shipmode IS NOT NULL ;

return 0 row

Rewritten sql

SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL
EXCEPT
SELECT DISTINCT l_shipinstruct, l_extendedprice, l_comment
FROM lineitem
WHERE l_shipmode IN (
    SELECT l_commitdate
    FROM lineitem
)AND l_shipmode IS NOT NULL
AND l_commitdate IS NOT NULL;

return 5895 row

Analysis

These two queries are logically equivalent, although they are written differently.

The original query uses a NOT IN subquery to filter rows where l_shipmode is not in the set of l_commitdate (where l_commitdate is not NULL) and where l_shipmode is not NULL. Specifically, the query:

The rewritten query uses an EXCEPT subquery to achieve the same logic:

The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.

How to repeat

The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .