pingcap / tidb

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
https://pingcap.com
Apache License 2.0
37.41k stars 5.85k forks source link

Undetermined results of the SELECT statement with CTE #38170

Closed JZuming closed 2 years ago

JZuming commented 2 years ago

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

Setup the environment:

tiup playground --db.binpath /path/to/latest/tidb-server &
mysql -h "127.0.0.1" -u root -P 4000 -D testdb < mysql_bk.sql

mysql_bk.sql: mysql_bk.txt

Test case


mysql -h "127.0.0.1" -u root -P 4000 -D testdb

mysql> WITH 
cte_0 AS (select distinct 
    ref_0.wkey as c0, 
    ref_0.pkey as c1, 
    ref_0.c_xhsndb as c2
  from 
    t_dnmxh as ref_0
  where (1 <= ( 
      select  
            ref_1.pkey not in (
                      select  
                            ref_5.wkey as c0
                          from 
                            t_dnmxh as ref_5
                          where (ref_5.wkey < ( 
                              select  
                                  ref_6.pkey as c0
                                from 
                                  t_cqmg3b as ref_6
                                where 88 between 96 and 76)) 
                        ) as c0
          from 
            (t_cqmg3b as ref_1
              left outer join t_dnmxh as ref_2
              on (ref_1.wkey = ref_2.wkey ))
          where ref_0.c_xhsndb is NULL
        union
        select  
            33 <= 91 as c0
          from 
            t_cqmg3b as ref_8
          ))), 
cte_1 AS (select  
    ref_9.wkey as c0, 
    ref_9.pkey as c1, 
    ref_9.c_anpf_c as c2, 
    ref_9.c_b_fp_c as c3, 
    ref_9.c_ndccfb as c4, 
    ref_9.c_8rswc as c5
  from 
    t_cqmg3b as ref_9)
select  
    ref_10.c0 as c0, 
    ref_10.c1 as c1, 
    ref_10.c2 as c2
  from 
    cte_0 as ref_10
  where case when 56 < 50 then case when 100 in (
          select distinct 
              ref_11.c4 as c0
            from 
              cte_1 as ref_11
            where (ref_11.c4 > ( 
                  select  
                      ref_13.pkey as c0
                    from 
                      t_dnmxh as ref_13
                    where (ref_13.wkey > ( 
                        select distinct 
                            ref_11.c1 as c0
                          from 
                            cte_0 as ref_14)) 
                      )) 
              or (1 = 1)) then null else null end
         else '7mxv6' end
       not like 'ki4%vc';

2. What did you expect to see? (Required)

Because it does not use functions involving randomness, the output of this statement is determined and will not change in different runs.

3. What did you see instead (Required)

Sometimes it outputs 3 rows

+------+--------+-------+
| c0   | c1     | c2    |
+------+--------+-------+
|  104 | 572000 | 44.37 |
|  106 | 585000 |  NULL |
|  104 | 575000 |  9.53 |
+------+--------+-------+
3 rows in set (0.96 sec)

Sometimes it outputs 5 rows

+------+--------+-------+
| c0   | c1     | c2    |
+------+--------+-------+
|  104 | 577000 | 47.96 |
|  106 | 588000 |  NULL |
|  106 | 586000 |  NULL |
|  108 | 598000 |  NULL |
|  113 | 619000 |  NULL |
+------+--------+-------+
5 rows in set (0.76 sec)

Sometimes it outputs 8 rows

+------+--------+-------+
| c0   | c1     | c2    |
+------+--------+-------+
|  113 | 623000 | 63.81 |
|  113 | 618000 |  92.6 |
|  113 | 622000 |  31.3 |
|  104 | 576000 |  92.4 |
|  113 | 617000 |  NULL |
|  113 | 620000 | 91.65 |
|  106 | 589000 |  NULL |
|  108 | 597000 | 47.51 |
+------+--------+-------+
8 rows in set (0.81 sec)

4. What is your TiDB version? (Required)

Release Version: v6.3.0-20220913
Edition: Community
Git Commit Hash: 95e09ba33c0bdb629c4d4a7a31a2e4bc1212ad2a
Git Branch: HEAD
UTC Build Time: 2022-09-24 12:55:50
GoVersion: go1.19.1
Race Enabled: false
TiKV Min Version: 6.2.0-alpha
Check Table Before Drop: false
Store: tikv
JZuming commented 2 years ago

If I do some small changes in the test case, the output become determined.

For example,

(ref_11.c4 > (
                  select
                      ref_13.pkey as c0
                    from
                      t_dnmxh as ref_13
                    where (ref_13.wkey > (
                        select distinct
                            ref_11.c1 as c0
                          from
                            cte_0 as ref_14))
                      ))
              or (1 = 1)

change to

(1 = 1)

The change is semantic equal because "anything or (1=1)" is "(1=1)".

The test case becomes

WITH 
cte_0 AS (select distinct 
    ref_0.wkey as c0, 
    ref_0.pkey as c1, 
    ref_0.c_xhsndb as c2
  from 
    t_dnmxh as ref_0
  where (1 <= ( 
      select  
            ref_1.pkey not in (
                      select  
                            ref_5.wkey as c0
                          from 
                            t_dnmxh as ref_5
                          where (ref_5.wkey < ( 
                              select  
                                  ref_6.pkey as c0
                                from 
                                  t_cqmg3b as ref_6
                                where 88 between 96 and 76)) 
                        ) as c0
          from 
            (t_cqmg3b as ref_1
              left outer join t_dnmxh as ref_2
              on (ref_1.wkey = ref_2.wkey ))
          where ref_0.c_xhsndb is NULL
        union
        select  
            33 <= 91 as c0
          from 
            t_cqmg3b as ref_8
          ))), 
cte_1 AS (select  
    ref_9.wkey as c0, 
    ref_9.pkey as c1, 
    ref_9.c_anpf_c as c2, 
    ref_9.c_b_fp_c as c3, 
    ref_9.c_ndccfb as c4, 
    ref_9.c_8rswc as c5
  from 
    t_cqmg3b as ref_9)
select  
    ref_10.c0 as c0, 
    ref_10.c1 as c1, 
    ref_10.c2 as c2
  from 
    cte_0 as ref_10
  where case when 56 < 50 then case when 100 in (
          select distinct 
              ref_11.c4 as c0
            from 
              cte_1 as ref_11
            where (1 = 1)) then null else null end
         else '7mxv6' end
       not like 'ki4%vc';

The output becomes stable and determined

+------+--------+--------+
| c0   | c1     | c2     |
+------+--------+--------+
|  106 | 586000 |   NULL |
|  106 | 588000 |   NULL |
|  108 | 598000 |   NULL |
|  104 | 577000 |  47.96 |
|  113 | 619000 |   NULL |
|  104 | 572000 |  44.37 |
|  104 | 575000 |   9.53 |
|  106 | 585000 |   NULL |
|  108 | 595000 |  13.35 |
|  104 | 571000 |   NULL |
|  104 | 574000 |   91.5 |
|  106 | 587000 |   NULL |
|  113 | 621000 | 100.46 |
|  108 | 596000 |  13.51 |
|  113 | 616000 |  24.73 |
|  104 | 573000 |  59.91 |
|  113 | 620000 |  91.65 |
|  113 | 623000 |  63.81 |
|  113 | 617000 |   NULL |
|  113 | 618000 |   92.6 |
|  113 | 622000 |   31.3 |
|  104 | 576000 |   92.4 |
|  106 | 589000 |   NULL |
|  108 | 597000 |  47.51 |
+------+--------+--------+
24 rows in set (0.02 sec)

So, I believe the original test case trigger a bug.