Open acelyc111 opened 4 years ago
(gdb) f 1
#1 0x000000000101f52b in doris::Tuple::deep_copy (this=<optimized out>, desc=..., data=data@entry=0x7efe86d80280, offset=offset@entry=0x7efe86d8027c, convert_ptrs=convert_ptrs@entry=true) at /home/laiyingchun/ap_doris/be/src/runtime/tuple.cpp:135
135 /home/laiyingchun/ap_doris/be/src/runtime/tuple.cpp: No such file or directory.
(gdb) p *string_v
$2 = {
static MAX_LENGTH = 1073741824,
ptr = 0xffffff40e12409c4 <Address 0xffffff40e12409c4 out of bounds>,
len = 18446744073709551615
}
plan_root=
conjuncts=[] id=4 type=ASSERT_NUM_ROWS_NODE tuple_ids=[5, ]
ExchangeNode(#senders=4 conjuncts=[] id=28 type=EXCHANGE_NODE tuple_ids=[6, ])
Then I printed the plan nodes using the log at the location below. https://github.com/apache/incubator-doris/blob/eefad13107ac74a406212e0f0f57181973ac9c1e/be/src/exec/exec_node.cpp#L340
Log when creating the ASSERT_NUM_ROWS_NODE:
TPlanNode {
01: node_id (i32) = 4,
02: node_type (i32) = 23,
03: num_children (i32) = 1,
04: limit (i64) = -1,
05: row_tuples (list) = list<i32>[1] {
[0] = 5,
},
06: nullable_tuples (list) = list<bool>[1] {
[0] = false,
},
08: compact_data (bool) = false,
32: assert_num_rows_node (struct) = TAssertNumRowsNode {
01: desired_num_rows (i64) = 1,
02: subquery_string (string) = "SELECT avg(`ss_net_profit`) AS `rank_col` FROM `default_cluster:tpcds`.`store_sales` WHERE (`ss_store_sk` = 4) AND (`ss_addr_sk` IS NULL) GROUP BY `ss_store_sk`",
},
}
Log when creating the child ExchangeNode:
TPlanNode {
01: node_id (i32) = 28,
02: node_type (i32) = 9,
03: num_children (i32) = 0,
04: limit (i64) = -1,
05: row_tuples (list) = list<i32>[1] {
[0] = 6,
},
06: nullable_tuples (list) = list<bool>[1] {
[0] = false,
},
08: compact_data (bool) = false,
15: exchange_node (struct) = TExchangeNode {
01: input_row_tuples (list) = list<i32>[1] {
[0] = 6,
},
},
}
The AssertNumRowsNode gets its row batches from its child (the ExchangeNode), whose row_tuples id is 6. The fragment also has a sink node, and the sink node uses the same row_tuples id as the root plan node (AssertNumRowsNode), which is 5.
Thus, the sink node uses Tuple(id=5 size=24 slots=[Slot(id=12 type=INT col=-1 offset=4 null=(offset=0 mask=80)), Slot(id=13 type=VARCHAR col=-1 offset=8 null=(offset=0 mask=40))] has_varlen_slots=1)
to parse data that was created with Tuple(id=6 size=24 slots=[Slot(id=14 type=INT col=-1 offset=4 null=(offset=0 mask=80)), Slot(id=15 type=DECIMALV2(9, 0) col=-1 offset=8 null=(offset=0 mask=40))] has_varlen_slots=0). The DECIMALV2 slot at offset 8 is therefore read as a VARCHAR, which explains the out-of-bounds ptr and the huge len in the gdb output.
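The effect of reading the slot at offset 8 with the wrong descriptor can be sketched in isolation. This is only an illustration, not Doris code: `FakeStringValue` and `reinterpret_decimal_slot` are hypothetical names. It assumes the decimal slot holds a 16-byte two's-complement integer, that the string slot is a pointer followed by a 64-bit length, and that the platform is little-endian 64-bit with the GCC/Clang `__int128` extension.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-byte string slot: pointer, then length.
// The pointer is kept as an integer so it can never be dereferenced here.
struct FakeStringValue {
    uint64_t ptr;
    uint64_t len;
};

// Writes a 128-bit integer into a 24-byte tuple buffer at offset 8
// (as a decimal slot would be, under the assumptions above), then reads
// the same 16 bytes back as if they were a string slot.
FakeStringValue reinterpret_decimal_slot(__int128 decimal_bits) {
    unsigned char tuple[24] = {0};
    const std::size_t slot_offset = 8;  // same offset in both tuple layouts

    // Producer side: the slot is filled using the decimal layout.
    std::memcpy(tuple + slot_offset, &decimal_bits, sizeof(decimal_bits));

    // Consumer side: the slot is parsed using the string layout.
    FakeStringValue sv;
    std::memcpy(&sv, tuple + slot_offset, sizeof(sv));
    return sv;
}
```

Under those assumptions, calling `reinterpret_decimal_slot(-static_cast<__int128>(0xbf1edbf63cLL))` yields `ptr == 0xffffff40e12409c4` and `len == 18446744073709551615`, the same values gdb printed for `*string_v`. Any small negative 128-bit value gives an all-ones upper half, so a `len` of 2^64 - 1 is what one would expect if a negative decimal were parsed as a string. A deep copy that trusts such a `ptr` and `len` would read out of bounds.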
Describe the bug: The BE crashed when running the TPC-DS test.
To Reproduce Steps to reproduce the behavior:
SELECT asceding.rnk, i1.i_product_name best_performing, i2.i_product_name worst_performing FROM (SELECT * FROM (SELECT item_sk, rank() OVER ( ORDER BY rank_col ASC) rnk FROM (SELECT ss_item_sk item_sk, avg(ss_net_profit) rank_col FROM store_sales ss1 WHERE ss_store_sk = 4 GROUP BY ss_item_sk HAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col FROM store_sales WHERE ss_store_sk = 4 AND ss_addr_sk IS NULL GROUP BY ss_store_sk)) V1) V11 WHERE rnk < 11) asceding, (SELECT * FROM (SELECT item_sk, rank() OVER ( ORDER BY rank_col DESC) rnk FROM (SELECT ss_item_sk item_sk, avg(ss_net_profit) rank_col FROM store_sales ss1 WHERE ss_store_sk = 4 GROUP BY ss_item_sk HAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col FROM store_sales WHERE ss_store_sk = 4 AND ss_addr_sk IS NULL GROUP BY ss_store_sk)) V2) V21 WHERE rnk < 11) descending, item i1, item i2 WHERE asceding.rnk = descending.rnk AND i1.i_item_sk = asceding.item_sk AND i2.i_item_sk = descending.item_sk ORDER BY asceding.rnk LIMIT 100;
Expected behavior: The SQL finishes and returns its result normally, with no BE crash.
Screenshots backtrace: