pingcap / tiflash

The analytical engine for TiDB and TiDB Cloud. Try free: https://tidbcloud.com/free-trial
https://docs.pingcap.com/tidb/stable/tiflash-overview
Apache License 2.0

TiFlash (v7.5.1) query on ENUM column leads to core dump #9197

Closed uzuki27 closed 3 months ago

uzuki27 commented 3 months ago

Bug Report

I'm using TiDB v7.5.1 in production. I used DM to replicate one table from our OLTP database and then added a TiFlash replica for it. Once the TiFlash replica was ready, I ran a query and the load on my two TiFlash servers rose dramatically; I had to restart the TiFlash servers to make my service available again. I tried to reproduce it in a playground environment and realized that querying a column of type ENUM causes the problem.

1. Minimal reproduce step (Required)

create table t1 (id int PRIMARY KEY, col1 ENUM('OK', 'NOT_OK') ) ;
alter table t1 set tiflash replica 1;
insert into t1 VALUES (1, 'OK'), (2,'NOT_OK');
explain select col1 from t1 ;
+-------------------------+----------+--------------+---------------+--------------------------------------+
| id                      | estRows  | task         | access object | operator info                        |
+-------------------------+----------+--------------+---------------+--------------------------------------+
| TableReader_6           | 10000.00 | root         |               | MppVersion: 2, data:ExchangeSender_5 |
| └─ExchangeSender_5      | 10000.00 | mpp[tiflash] |               | ExchangeType: PassThrough            |
|   └─TableFullScan_4     | 10000.00 | mpp[tiflash] | table:t1      | keep order:false, stats:pseudo       |
+-------------------------+----------+--------------+---------------+--------------------------------------+
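The plan above already shows that the scan is pushed down to TiFlash. As a minimal sketch for checking this on larger plans, the following Python helper lists the operators whose task column mentions tiflash. The function name `tiflash_operators` and the `SAMPLE` string are illustrative assumptions, not part of any TiDB tooling; the parser only assumes the pipe-delimited table layout that the MySQL client prints.

```python
def tiflash_operators(explain_text: str):
    """Return the ids of operators whose task column mentions tiflash."""
    ops = []
    for line in explain_text.splitlines():
        if not line.startswith("|"):
            continue  # skip blank lines and +----+ border lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 3 or cells[0] == "id":
            continue  # skip malformed rows and the header row
        if "tiflash" in cells[2]:
            # Drop the tree-drawing prefix in front of the operator id.
            ops.append(cells[0].lstrip("└─ "))
    return ops


# Rows copied from the EXPLAIN output in this report.
SAMPLE = """
| id                      | estRows  | task         | access object | operator info                        |
| TableReader_6           | 10000.00 | root         |               | MppVersion: 2, data:ExchangeSender_5 |
| └─ExchangeSender_5      | 10000.00 | mpp[tiflash] |               | ExchangeType: PassThrough            |
|   └─TableFullScan_4     | 10000.00 | mpp[tiflash] | table:t1      | keep order:false, stats:pseudo       |
"""

print(tiflash_operators(SAMPLE))
```

For this plan the helper reports `ExchangeSender_5` and `TableFullScan_4`, which matches the `ExchangeSenderSinkOp` frame that appears in the crash stack.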

2. What did you expect to see? (Required)

select col1 from t1 ;
+--------+
| col1   |
+--------+
| OK     |
| NOT_OK |
+--------+

3. What did you see instead (Required)

select col1 from t1 ;
ERROR 1105 (HY000): rpc error: code = Unavailable desc = error reading from server: EOF

and log:

tiflash quit: signal: segmentation fault (core dumped)
[2024/07/08 18:27:02.442 +07:00] [ERROR] [BaseDaemon.cpp:563] ["\n       0x7725d81\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+124935553]\n                \tlibs/libdaemon/src/BaseDaemon.cpp:214\n  0x7f6cd900e630\t<unknown symbol> [libpthread.so.0+63024]\n       0x8f0f7f6\tmemcpy [tiflash+150009846]\n                \tlibs/libmemcpy/memcpy.cpp:26\n       0x1ed170b\tDB::WriteBuffer::write(char const*, unsigned long) [tiflash+32315147]\n                \tdbms/src/IO/WriteBuffer.h:93\n       0x86ed283\tDB::flashColToArrowCol(DB::TiDBColumn&, DB::ColumnWithTypeAndName const&, tipb::FieldType const&, unsigned long, unsigned long) [tiflash+141480579]\n                \tdbms/src/Flash/Coprocessor/ArrowColCodec.cpp:485\n       0x86e9259\tDB::ArrowChunkCodecStream::encode(DB::Block const&, unsigned long, unsigned long) [tiflash+141464153]\n                \tdbms/src/Flash/Coprocessor/ArrowChunkCodec.cpp:44\n       0x1e6a81e\tDB::StreamingDAGResponseWriter<std::__1::shared_ptr<DB::AsyncMPPTunnelSetWriter> >::encodeThenWriteBlocks() [tiflash+31893534]\n                \tdbms/src/Flash/Coprocessor/StreamingDAGResponseWriter.cpp:128\n       0x8951baf\tDB::ExchangeSenderSinkOp::writeImpl(DB::Block&&) [tiflash+143989679]\n                \tdbms/src/Operators/ExchangeSenderSinkOp.cpp:33\n       0x88ce6a3\tDB::SinkOp::write(DB::Block&&) [tiflash+143451811]\n                \tdbms/src/Operators/Operator.cpp:183\n       0x88cb62a\tDB::PipelineExec::executeImpl() [tiflash+143439402]\n                \tdbms/src/Flash/Pipeline/Exec/PipelineExec.cpp:125\n       0x88013e6\tDB::PipelineTaskBase::runExecute() [tiflash+142611430]\n                \tdbms/src/Flash/Pipeline/Schedule/Tasks/PipelineTaskBase.h:72\n       0x88eed5a\tDB::Task::execute() [tiflash+143584602]\n                \tdbms/src/Flash/Pipeline/Schedule/Tasks/Task.cpp:133\n       0x1e78b65\tDB::TaskThreadPool<DB::CPUImpl>::loop(unsigned long) [tiflash+31951717]\n                \tdbms/src/Flash/Pipeline/Schedule/ThreadPool/TaskThreadPool.cpp:59\n       0x1e79286\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (DB::TaskThreadPool<DB::CPUImpl>::*)(unsigned long), DB::TaskThreadPool<DB::CPUImpl>*, unsigned long> >(void*) [tiflash+31953542]\n                \t/usr/local/bin/../include/c++/v1/thread:291\n  0x7f6cd9006ea5\tstart_thread [libpthread.so.0+32421]"] [source=BaseDaemon] [thread_id=94]
[2024/07/08 18:27:04.362 +07:00] [DEBUG] [LocalAdmissionController.cpp:263] ["fetch token from GAC periodically(5sec): acquire_infos: rg: default, acquire_tokens: 0, ru_consumption_delta: 0.001007080078125;, req: requests {\n  resource_group_name: \"default\"\n  consumption_since_last_request {\n    r_r_u: 0.001007080078125\n  }\n  is_tiflash: true\n}\ntarget_request_period_ms: 5000\nclient_unique_id: 2064466988\n. resp: responses {\n  resource_group_name: \"default\"\n}\n"] [source=LocalAdmissionController] [thread_id=50]
[2024/07/08 18:27:08.410 +07:00] [INFO] [TiDBSchemaSyncer.cpp:91] ["Start to sync schemas. current version is: 55 and try to sync schema version to: 57"] [source="keyspace=4294967295"] [thread_id=63]
[2024/07/08 18:27:08.411 +07:00] [DEBUG] [SchemaGetter.cpp:312] ["Get TableInfo from TiKV, table_id=108 {\"id\":108,\"name\":{\"O\":\"t1\",\"L\":\"t1\"},\"charset\":\"utf8mb4\",\"collate\":\"utf8mb4_bin\",\"cols\":[{\"id\":1,\"name\":{\"O\":\"id\",\"L\":\"id\"},\"offset\":0,\"origin_default\":null,\"origin_default_bit\":null,\"default\":null,\"default_bit\":null,\"default_is_expr\":false,\"generated_expr_string\":\"\",\"generated_stored\":false,\"dependences\":null,\"type\":{\"Tp\":3,\"Flag\":4099,\"Flen\":11,\"Decimal\":0,\"Charset\":\"binary\",\"Collate\":\"binary\",\"Elems\":null,\"ElemsIsBinaryLit\":null,\"Array\":false},\"state\":5,\"comment\":\"\",\"hidden\":false,\"change_state_info\":null,\"version\":2},{\"id\":2,\"name\":{\"O\":\"col1\",\"L\":\"col1\"},\"offset\":1,\"origin_default\":null,\"origin_default_bit\":null,\"default\":null,\"default_bit\":null,\"default_is_expr\":false,\"generated_expr_string\":\"\",\"generated_stored\":false,\"dependences\":null,\"type\":{\"Tp\":247,\"Flag\":0,\"Flen\":6,\"Decimal\":0,\"Charset\":\"utf8mb4\",\"Collate\":\"utf8mb4_bin\",\"Elems\":[\"OK\",\"NOT_OK\"],\"ElemsIsBinaryLit\":null,\"Array\":false},\"state\":5,\"comment\":\"\",\"hidden\":false,\"change_state_info\":null,\"version\":2}],\"index_info\":null,\"constraint_info\":null,\"fk_info\":null,\"state\":5,\"pk_is_handle\":true,\"is_common_handle\":false,\"common_handle_version\":0,\"comment\":\"\",\"auto_inc_id\":0,\"auto_id_cache\":0,\"auto_rand_id\":0,\"max_col_id\":2,\"max_idx_id\":0,\"max_fk_id\":0,\"max_cst_id\":0,\"update_timestamp\":451002494680039434,\"ShardRowIDBits\":0,\"max_shard_row_id_bits\":0,\"auto_random_bits\":0,\"auto_random_range_bits\":0,\"pre_split_regions\":0,\"partition\":null,\"compression\":\"\",\"view\":null,\"sequence\":null,\"Lock\":null,\"version\":5,\"tiflash_replica\":{\"Count\":1,\"LocationLabels\":[],\"Available\":true,\"AvailablePartitionIDs\":null},\"is_columnar\":false,\"temp_table_type\":0,\"cache_table_status\":0,\"policy_ref_info\":null,\"stats_options\":null,\"exchange_partition_info\":null,\"ttl_info\":null}"] [thread_id=63]
[2024/07/08 18:27:08.412 +07:00] [DEBUG] [SchemaGetter.cpp:312] ["Get TableInfo from TiKV, table_id=108 {\"id\":108,\"name\":{\"O\":\"t1\",\"L\":\"t1\"},\"charset\":\"utf8mb4\",\"collate\":\"utf8mb4_bin\",\"cols\":[{\"id\":1,\"name\":{\"O\":\"id\",\"L\":\"id\"},\"offset\":0,\"origin_default\":null,\"origin_default_bit\":null,\"default\":null,\"default_bit\":null,\"default_is_expr\":false,\"generated_expr_string\":\"\",\"generated_stored\":false,\"dependences\":null,\"type\":{\"Tp\":3,\"Flag\":4099,\"Flen\":11,\"Decimal\":0,\"Charset\":\"binary\",\"Collate\":\"binary\",\"Elems\":null,\"ElemsIsBinaryLit\":null,\"Array\":false},\"state\":5,\"comment\":\"\",\"hidden\":false,\"change_state_info\":null,\"version\":2},{\"id\":2,\"name\":{\"O\":\"col1\",\"L\":\"col1\"},\"offset\":1,\"origin_default\":null,\"origin_default_bit\":null,\"default\":null,\"default_bit\":null,\"default_is_expr\":false,\"generated_expr_string\":\"\",\"generated_stored\":false,\"dependences\":null,\"type\":{\"Tp\":247,\"Flag\":0,\"Flen\":6,\"Decimal\":0,\"Charset\":\"utf8mb4\",\"Collate\":\"utf8mb4_bin\",\"Elems\":[\"OK\",\"NOT_OK\"],\"ElemsIsBinaryLit\":null,\"Array\":false},\"state\":5,\"comment\":\"\",\"hidden\":false,\"change_state_info\":null,\"version\":2}],\"index_info\":null,\"constraint_info\":null,\"fk_info\":null,\"state\":5,\"pk_is_handle\":true,\"is_common_handle\":false,\"common_handle_version\":0,\"comment\":\"\",\"auto_inc_id\":0,\"auto_id_cache\":0,\"auto_rand_id\":0,\"max_col_id\":2,\"max_idx_id\":0,\"max_fk_id\":0,\"max_cst_id\":0,\"update_timestamp\":451002494680039434,\"ShardRowIDBits\":0,\"max_shard_row_id_bits\":0,\"auto_random_bits\":0,\"auto_random_range_bits\":0,\"pre_split_regions\":0,\"partition\":null,\"compression\":\"\",\"view\":null,\"sequence\":null,\"Lock\":null,\"version\":5,\"tiflash_replica\":{\"Count\":1,\"LocationLabels\":[],\"Available\":true,\"AvailablePartitionIDs\":null},\"is_columnar\":false,\"temp_table_type\":0,\"cache_table_status\":0,\"policy_ref_info\":null,\"stats_options\":null,\"exchange_partition_info\":null,\"ttl_info\":null}"] [thread_id=63]
[2024/07/08 18:27:08.412 +07:00] [INFO] [TiDBSchemaSyncer.cpp:124] ["End sync schema, version has been updated to 57"] [source="keyspace=4294967295"] [thread_id=63]
[2024/07/08 18:27:09.120 +07:00] [DEBUG] [SchemaSyncService.cpp:121] ["add sync schema task for keyspaces done, num_add_tasks=0"] [thread_id=60]
[2024/07/08 18:27:09.120 +07:00] [DEBUG] [SchemaSyncService.cpp:153] ["remove sync schema task for keyspaces done, num_remove_tasks=0"] [thread_id=60]
[2024/07/08 18:27:09.171 +07:00] [DEBUG] [GCManager.cpp:72] ["Start GC with keyspace=4294967295, table_id=4"] [thread_id=71]
[2024/07/08 18:27:09.171 +07:00] [DEBUG] [GCManager.cpp:135] ["End GC and next gc will start with keyspace=4294967295, table_id=4"] [thread_id=71]
...
check detail log from: /root/.tiup/data/UHvxLzQ/tiflash-0/tiflash.log
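The stack trace in the ERROR line is hard to read because the log stores newlines and tabs as escaped `\n` and `\t` sequences. A minimal sketch of how it could be split into frames is shown here in Python. The helper name `parse_frames` and the regular expression are assumptions based only on the layout visible in this log, not a documented TiFlash format.

```python
import re

# One frame header: address, TAB, symbol, then "[binary+offset]".
FRAME_RE = re.compile(r"(0x[0-9a-f]+)\t(.+?) \[([^\]\s]+)\+(\d+)\]$")


def parse_frames(raw: str):
    """Split an escaped stack trace into dicts with addr, symbol and source."""
    # Turn the two-character sequences \n and \t into real control characters.
    text = raw.replace("\\n", "\n").replace("\\t", "\t")
    frames = []
    for line in text.split("\n"):
        stripped = line.strip(" ")
        match = FRAME_RE.match(stripped)
        if match:
            frames.append(
                {"addr": match.group(1), "symbol": match.group(2), "source": None}
            )
        elif stripped.startswith("\t") and frames and frames[-1]["source"] is None:
            # A tab-indented line carries file:line for the preceding frame.
            frames[-1]["source"] = stripped.strip()
    return frames


# First frames of the trace above, copied as a raw string.
SAMPLE = (
    r"\n       0x7725d81\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+124935553]"
    r"\n                \tlibs/libdaemon/src/BaseDaemon.cpp:214"
    r"\n  0x7f6cd900e630\t<unknown symbol> [libpthread.so.0+63024]"
    r"\n       0x8f0f7f6\tmemcpy [tiflash+150009846]"
    r"\n                \tlibs/libmemcpy/memcpy.cpp:26"
    r"\n       0x1ed170b\tDB::WriteBuffer::write(char const*, unsigned long) [tiflash+32315147]"
    r"\n                \tdbms/src/IO/WriteBuffer.h:93"
)

for frame in parse_frames(SAMPLE):
    print(frame["addr"], frame["symbol"], frame["source"])
```

Read this way, the trace shows the fault inside `memcpy`, called from `DB::WriteBuffer::write` while `DB::flashColToArrowCol` (ArrowColCodec.cpp:485) was encoding a column for the response.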

4. What is your TiFlash version? (Required)

v7.5.1

uzuki27 commented 3 months ago

Sorry, my bad: this issue is a duplicate of #8674 and is fixed in v7.5.2. I will upgrade.