pingcap / docs

TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://docs.pingcap.com
Other
588 stars 683 forks source link

BugReport: Issues with Exists function #19476

Open LLuopeiqi opened 1 day ago

LLuopeiqi commented 1 day ago

BugReport: Issues with Exists function

version

8.3.0

Original sql

SELECT l_extendedprice
FROM lineitem
WHERE (l_comment IN ( COALESCE(1198529099, lineitem.l_discount) )  ) 
    = 0.05420610582877583
GROUP BY l_extendedprice;

return 0 row

Rewritten sql

SELECT l_extendedprice
FROM lineitem
WHERE NOT EXISTS (
    SELECT 1
    WHERE l_comment <> 1198529099 OR l_comment IS NULL
) = 0.05420610582877583
GROUP BY l_extendedprice;

return 4467 row

Analysis

These two queries are logically equivalent, although they are written differently.

The original query uses the COALESCE function to return the first non-NULL value, checking if 1198529099 is NULL (which it is not), and then uses IN to check if l_comment is equal to 1198529099. It then filters the rows by comparing the result of the IN condition with 0.05420610582877583.

The rewritten query uses the NOT EXISTS subquery to check if l_comment is not equal to 1198529099 or if it is NULL. If this condition holds true, the row is excluded. Finally, the result of the NOT EXISTS subquery is compared with 0.05420610582877583 to filter the rows.

The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.

How to repeat

The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .