pingcap / docs

TiDB database documentation. TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at : https://www.pingcap.com/tidb-serverless/
https://docs.pingcap.com
Other
588 stars 683 forks source link

Bug Report: Issues with IF function #19475

Open LLuopeiqi opened 1 day ago

LLuopeiqi commented 1 day ago

BugReport: Issues with IF function

version

8.3.0

Original sql

SELECT 
    O_CUSTKEY 
FROM 
    orders
WHERE 
    (
        O_COMMENT   IN (
            IF(
                -1801656545 IS NOT NULL, 
                -1801656545, 

            )
        )
    ) = 0.24036834733741486
GROUP BY 
    O_CUSTKEY ;

return 0 row

Rewritten sql

SELECT  O_CUSTKEY 
FROM orders
WHERE NOT EXISTS (
    SELECT 1
    WHERE O_COMMENT <> -1801656545 OR O_COMMENT IS NULL
) = 0.24036834733741486
GROUP BY  O_CUSTKEY ;

return 101 row

Analysis

These two queries are logically equivalent, although they are written differently.

Original Query: The original query uses the IF function to check if -1801656545 is not NULL. If it is not NULL, it returns -1801656545; otherwise, it would return NULL (though in this specific query, -1801656545 is never NULL). It then uses IN to check if O_COMMENT matches the returned value and compares the result with 0.24036834733741486 to filter rows.

Rewritten Query: In the rewritten query, the NOT EXISTS subquery is used to check if O_COMMENT is not equal to -1801656545 or is NULL, which is logically equivalent to the IN check in the original query. The final result of the NOT EXISTS subquery is then compared with 0.24036834733741486 to filter the rows.

The two SQL queries are logically equivalent, but they return different results, indicating the presence of a bug.

How to repeat

The exported file for the database is in the attachment. : (https://github.com/LLuopeiqi/newtpcd/blob/main/tidb/tpcd.sql) .