StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

BE crash when BinaryColumn overflow #5456

Closed (wangruin closed this issue 2 years ago)

wangruin commented 2 years ago

Steps to reproduce the behavior (Required)

  1. CREATE TABLE tpcds_100g_inventory (
       inv_date_sk int(11) NOT NULL COMMENT "",
       inv_item_sk int(11) NOT NULL COMMENT "",
       inv_warehouse_sk int(11) NOT NULL COMMENT "",
       inv_quantity_on_hand int(11) NULL COMMENT ""
     ) ENGINE=OLAP
     DUPLICATE KEY(inv_date_sk, inv_item_sk, inv_warehouse_sk)
     COMMENT "OLAP"
     DISTRIBUTED BY HASH(inv_date_sk, inv_item_sk, inv_warehouse_sk) BUCKETS 5
     PROPERTIES (
       "replication_num" = "1",
       "in_memory" = "false",
       "storage_format" = "DEFAULT"
     );
  2. CREATE TABLE tpch_100g_partsupp (
       PS_PARTKEY int(11) NOT NULL COMMENT "",
       PS_SUPPKEY int(11) NOT NULL COMMENT "",
       PS_AVAILQTY int(11) NOT NULL COMMENT "",
       PS_SUPPLYCOST decimal64(15, 2) NOT NULL COMMENT "",
       PS_COMMENT varchar(199) NOT NULL COMMENT ""
     ) ENGINE=OLAP
     DUPLICATE KEY(PS_PARTKEY)
     COMMENT "OLAP"
     DISTRIBUTED BY HASH(PS_PARTKEY) BUCKETS 12
     PROPERTIES (
       "replication_num" = "1",
       "in_memory" = "false",
       "storage_format" = "DEFAULT"
     );
  3. select
       ref_1.PS_SUPPKEY as c0,
       case
         when ref_0.inv_quantity_on_hand < ref_1.PS_AVAILQTY then ref_0.inv_date_sk
         else ref_0.inv_date_sk
       end as c1,
       ref_0.inv_warehouse_sk as c2,
       18 as c3,
       cast(nullif(ref_1.PS_COMMENT, ref_1.PS_COMMENT) as VARCHAR) as c4,
       71 as c5,
       ref_0.inv_warehouse_sk as c6,
       case
         when ref_1.PS_SUPPKEY < ref_0.inv_date_sk then
           max(cast(cast(nullif(ref_1.PS_AVAILQTY, ref_0.inv_date_sk) as INT) as INT))
             over (partition by ref_0.inv_date_sk order by ref_1.PS_COMMENT)
         else
           max(cast(cast(nullif(ref_1.PS_AVAILQTY, ref_0.inv_date_sk) as INT) as INT))
             over (partition by ref_0.inv_date_sk order by ref_1.PS_COMMENT)
       end as c7,
       pi() as c8,
       ref_0.inv_warehouse_sk as c9,
       cast(coalesce(ref_1.PS_PARTKEY, ref_0.inv_warehouse_sk) as INT) as c10
     from tpcds_100g_inventory as ref_0
       inner join tpch_100g_partsupp as ref_1
         on (ref_0.inv_quantity_on_hand = ref_1.PS_PARTKEY)
     where ref_1.PS_SUPPKEY >= ref_0.inv_item_sk
     limit 158;

Expected behavior (Required)

The query returns its result and the BE does not crash.

Real behavior (Required)

tcmalloc: large alloc 2133565440 bytes == 0x40c5c8000 @  0x554049b 0x57dfed6 0x57dbcb3 0x1f8854a 0x572e075 0x19f0f14 0x2396d22 0x2614479 0x26d8b18 0x26eb59d 0x26e1fce 0x20297c9 0x202537a 0x7f63bc9c2e65
tcmalloc: large alloc 1566801920 bytes == 0x48b882000 @  0x554049b 0x57dfed6 0x57dbcb3 0x1f8854a 0x572e075 0x19f0f14 0x2396d22 0x2614479 0x26d8b18 0x26eb59d 0x26e1fce 0x20297c9 0x202537a 0x7f63bc9c2e65
*** Aborted at 1650786530 (unix time) try "date -d @1650786530" if you are using GNU date ***
PC: @     0x7f63bc049fdb __memcmp_sse4_1
*** SIGSEGV (@0x0) received by PID 8967 (TID 0x7f6356737700) from PID 0; stack trace: ***
    @     0x7f63bc9ca5f0 (unknown)
    @     0x7f63bc049fdb __memcmp_sse4_1
    @          0x260ec30 _ZN14pdqsort_detail12pdqsort_loopIN9__gnu_cxx17__normal_iteratorIPN9starrocks10vectorized10SortHelper8SortItemINS3_5SliceEEESt6vectorIS8_SaIS8_EEEEZNS5_30sort_on_not_null_binary_columnILb0EEENS3_6StatusEPNS3_12RuntimeStateEPNS4_6ColumnEbRSA_INS4_15PermutationItemESaISK_EEmmEUlRKS8_SP_E_Lb0EEEvRKbT_ST_T0_ib.isra.0
    @          0x262add8 starrocks::vectorized::SortHelper::sort_on_not_null_binary_column<>()
    @          0x2624f9f starrocks::vectorized::ChunksSorterFullSort::_sort_by_columns()
    @          0x2626dc1 starrocks::vectorized::ChunksSorterFullSort::_sort_chunks()
    @          0x2626f07 starrocks::vectorized::ChunksSorterFullSort::done()
    @          0x25c8cf0 starrocks::vectorized::ChunksSorter::finish()
    @          0x26d7793 starrocks::pipeline::PartitionSortSinkOperator::set_finishing()
    @          0x26e9f87 starrocks::pipeline::PipelineDriver::_mark_operator_finishing()
    @          0x26ea029 starrocks::pipeline::PipelineDriver::_mark_operator_finished()
    @          0x26ea559 starrocks::pipeline::PipelineDriver::_mark_operator_cancelled()
    @          0x26ea84a starrocks::pipeline::PipelineDriver::cancel_operators()
    @          0x26e2189 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @          0x20297c9 starrocks::ThreadPool::dispatch_thread()
    @          0x202537a starrocks::Thread::supervise_thread()
    @     0x7f63bc9c2e65 start_thread
    @     0x7f63bbfdd88d __clone
    @                0x0 (unknown)

StarRocks version (Required)

murphyatwork commented 2 years ago

The root cause is still the BinaryColumn overflow. It should be resolved in version 2.3.
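
For readers unfamiliar with this kind of overflow, the following is a minimal sketch of how it can arise. It assumes a string column that packs all values into one byte buffer and locates each row through 32-bit offsets. That layout is an assumption made for illustration and is not confirmed anywhere in this thread. `ToyBinaryColumn`, `fits`, `append`, `get`, and `unchecked_next_offset` are hypothetical names, not StarRocks APIs, and the sketch does not represent the actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a binary (string) column.
// Row i occupies bytes[offsets[i]] .. bytes[offsets[i + 1]).
// This is NOT the StarRocks class; the layout is assumed for illustration.
struct ToyBinaryColumn {
    std::vector<char> bytes;                // concatenated payload of all rows
    std::vector<std::uint32_t> offsets{0};  // 32-bit end offsets, starts at 0

    // True if `extra` more bytes can still be addressed by a 32-bit offset.
    bool fits(std::size_t extra) const {
        return extra <= std::numeric_limits<std::uint32_t>::max() - offsets.back();
    }

    // Appends one value, refusing instead of letting the offset wrap around.
    bool append(const std::string& value) {
        if (!fits(value.size())) return false;
        bytes.insert(bytes.end(), value.begin(), value.end());
        offsets.push_back(offsets.back() +
                          static_cast<std::uint32_t>(value.size()));
        return true;
    }

    // Copies row `row` out of the packed buffer.
    std::string get(std::size_t row) const {
        assert(row + 1 < offsets.size());
        return std::string(bytes.data() + offsets[row],
                           offsets[row + 1] - offsets[row]);
    }
};

// What an unchecked 32-bit offset becomes once the running total passes
// 2^32 - 1: the result wraps modulo 2^32 and points at unrelated bytes.
inline std::uint32_t unchecked_next_offset(std::uint32_t current,
                                           std::uint64_t value_size) {
    return static_cast<std::uint32_t>(current + value_size);
}
```

Under these assumptions, a join with a large fan-out on a `varchar(199)` column such as `PS_COMMENT` could plausibly push the packed buffer past what 32-bit offsets can address. A later sort over that column would then compare slices built from wrapped offsets, which would be consistent with the crash inside `memcmp` in the trace above. A capacity check like `fits` before each append is one way to turn that into a reportable error.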