Closed yibin87 closed 1 month ago
Full test log: window_incorrect.log
It's a bit corner and hard to reproduce. Lower to major.
This is not a bug, the root cause is the test result is not stable The error query can be simplified as
select lag(r) over w1 as xx, count(s) from window_function_0_part group by r having r > 10 window w1 as (partition by i, s, i order by id, i, r, s, dt);
The plan is
mysql> explain select lag(r) over w1 as xx, count(s) from window_function_0_part group by r having r > 10 window w1 as (partition by i, s, i order by id, i, r, s, dt);
+----------------------------------------------+---------+--------------+------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+----------------------------------------------+---------+--------------+------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| TableReader_64 | 2.81 | root | partition:all | MppVersion: 2, data:ExchangeSender_63 |
| └─ExchangeSender_63 | 2.81 | mpp[tiflash] | | ExchangeType: PassThrough |
| └─Projection_10 | 2.81 | mpp[tiflash] | | Column#8->Column#9, Column#6->Column#10, stream_count: 8 |
| └─Window_62 | 2.81 | mpp[tiflash] | | lag(test.window_function_0_part.r)->Column#8 over(partition by test.window_function_0_part.i, test.window_function_0_part.s, test.window_function_0_part.i order by test.window_function_0_part.id, test.window_function_0_part.i, test.window_function_0_part.r, test.window_function_0_part.s, test.window_function_0_part.dt), stream_count: 8 |
| └─Sort_31 | 2.81 | mpp[tiflash] | | test.window_function_0_part.i, test.window_function_0_part.s, test.window_function_0_part.i, test.window_function_0_part.id, test.window_function_0_part.i, test.window_function_0_part.r, test.window_function_0_part.s, test.window_function_0_part.dt, stream_count: 8 |
| └─ExchangeReceiver_30 | 2.81 | mpp[tiflash] | | stream_count: 8 |
| └─ExchangeSender_29 | 2.81 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.window_function_0_part.i, collate: binary], [name: test.window_function_0_part.s, collate: utf8mb4_bin], [name: test.window_function_0_part.i, collate: binary], stream_count: 8 |
| └─Projection_25 | 2.81 | mpp[tiflash] | | Column#6, test.window_function_0_part.id, test.window_function_0_part.i, test.window_function_0_part.r, test.window_function_0_part.s, test.window_function_0_part.dt |
| └─HashAgg_26 | 2.81 | mpp[tiflash] | | group by:test.window_function_0_part.r, funcs:sum(Column#11)->Column#6, funcs:firstrow(Column#12)->test.window_function_0_part.id, funcs:firstrow(Column#13)->test.window_function_0_part.i, funcs:firstrow(test.window_function_0_part.r)->test.window_function_0_part.r, funcs:firstrow(Column#15)->test.window_function_0_part.s, funcs:firstrow(Column#16)->test.window_function_0_part.dt, stream_count: 8 |
| └─ExchangeReceiver_28 | 2.81 | mpp[tiflash] | | stream_count: 8 |
| └─ExchangeSender_27 | 2.81 | mpp[tiflash] | | ExchangeType: HashPartition, Compression: FAST, Hash Cols: [name: test.window_function_0_part.r, collate: binary], stream_count: 8 |
| └─HashAgg_14 | 2.81 | mpp[tiflash] | | group by:test.window_function_0_part.r, funcs:count(test.window_function_0_part.s)->Column#11, funcs:firstrow(test.window_function_0_part.id)->Column#12, funcs:firstrow(test.window_function_0_part.i)->Column#13, funcs:firstrow(test.window_function_0_part.s)->Column#15, funcs:firstrow(test.window_function_0_part.dt)->Column#16 |
| └─TableFullScan_23 | 8100.00 | mpp[tiflash] | table:window_function_0_part | pushed down filter:gt(test.window_function_0_part.r, 10), keep order:false, PartitionTableScan:true |
+----------------------------------------------+---------+--------------+------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.00 sec)
Note that in hashagg_14:
group by:test.window_function_0_part.r, funcs:count(test.window_function_0_part.s)->Column#11, funcs:firstrow(test.window_function_0_part.id)->Column#12, funcs:firstrow(test.window_function_0_part.i)->Column#13, funcs:firstrow(test.window_function_0_part.s)->Column#15, funcs:firstrow(test.window_function_0_part.dt)->Column#16
All the columns like id/i/s/dt
is not in the group by clause, which means the result of id/i/s/dt
is actually random after hashagg_14.
The following query shows this randomness clearly.
mysql> select r,count(s), id, i, s, dt from window_function_0_part where r > 10 group by r;
+------------+----------+-----+------+------+---------------------+
| r | count(s) | id | i | s | dt |
+------------+----------+-----+------+------+---------------------+
| 2147483646 | 2250 | 184 | NULL | NULL | 9999-12-03 00:00:00 |
| 123456.789 | 2250 | 364 | NULL | NULL | 9999-12-03 00:00:00 |
| 2147483647 | 2250 | 149 | NULL | NULL | 9999-12-02 23:59:59 |
+------------+----------+-----+------+------+---------------------+
3 rows in set (0.03 sec)
mysql> select r,count(s), id, i, s, dt from window_function_0_part where r > 10 group by r;
+------------+----------+------+-------------+------+---------------------+
| r | count(s) | id | i | s | dt |
+------------+----------+------+-------------+------+---------------------+
| 123456.789 | 2250 | 5548 | -2147483647 | NULL | 9999-12-03 00:00:00 |
| 2147483646 | 2250 | 5367 | -2147483647 | NULL | 1000-01-02 23:59:59 |
| 2147483647 | 2250 | 5332 | -2147483647 | NULL | 9999-12-03 00:00:00 |
+------------+----------+------+-------------+------+---------------------+
3 rows in set (0.03 sec)
Since the output of hashagg_14 is random, the query result is unstable. Close this bug as Not A Bug.
Test run log: [2024/05/23 16:29:56.352 +08:00] [INFO] [window_function_test.go:511] ["exec sql"] [sql="set @@session.tidb_allow_mpp=on;set @@session.tidb_enforce_mpp=1;set @@tiflash_fine_grained_shuffle_stream_count = 0;set @@tiflash_fine_grained_shuffle_batch_size = 1;select lag(r) over w1, count(s) from window_function_0_part group by r having r > 10 window w1 as (partition by i, s, i order by id, i, r, s, dt) order by 1, 2 limit 71 offset 1;"] [2024/05/23 16:29:56.423 +08:00] [ERROR] [window_function_test.go:165] ["windowFunctionTest error"] [error="result inconsistent tso 449957854275436545, difference: row number: 1, row1: 21474836462250, row2: 21474836472250, tikv: [21474836462250 21474836472250], tiflash: [21474836472250 NULL2250], onlyKV: [], onlyFlash: []"] [stack="github.com/pingcap/endless/testcase/tiflash/window_function.checkErr\n\t/home/jenkins/agent/workspace/endless-master-build/testcase/tiflash/window_function/window_function_test.go:165\ngithub.com/pingcap/endless/testcase/tiflash/window_function.glob..func1.2.2\n\t/home/jenkins/agent/workspace/endless-master-build/testcase/tiflash/window_function/window_function_test.go:70\ngithub.com/onsi/ginkgo/internal/leafnodes.(runner).runSync\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/runner.go:113\ngithub.com/onsi/ginkgo/internal/leafnodes.(runner).run\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/runner.go:64\ngithub.com/onsi/ginkgo/internal/leafnodes.(ItNode).Run\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/leafnodes/it_node.go:26\ngithub.com/onsi/ginkgo/internal/spec.(Spec).runSample\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/spec/spec.go:215\ngithub.com/onsi/ginkgo/internal/spec.(Spec).Run\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/spec/spec.go:138\ngithub.com/onsi/ginkgo/internal/specrunner.(SpecRunner).runSpec\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:200\ngithub.com/onsi/ginkgo/internal/specrunner.(SpecRunner).runSpecs\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:170\ngithub.com/onsi/ginkgo/internal/specrunner.(SpecRunner).Run\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/specrunner/spec_runner.go:66\ngithub.com/onsi/ginkgo/internal/suite.(*Suite).Run\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/internal/suite/suite.go:79\ngithub.com/onsi/ginkgo.runSpecsWithCustomReporters\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/ginkgo_dsl.go:245\ngithub.com/onsi/ginkgo.RunSpecs\n\t/go/pkg/mod/github.com/onsi/ginkgo@v1.16.5/ginkgo_dsl.go:220\ngithub.com/pingcap/endless/testcase/tiflash/window_function.TestWindowFunction\n\t/home/jenkins/agent/workspace/endless-master-build/testcase/tiflash/window_function/window_function_suite_test.go:25\ntesting.tRunner\n\t/usr/local/go/src/testing/testing.go:1446"]