apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Bug] How to resolve the Kafka consume timeout problem of ROUTINE LOAD in Doris 2.0.1? #27932

Open AlexLWei opened 12 months ago

AlexLWei commented 12 months ago

Version

doris-2.0.1

What's Wrong?

The routine load job is submitted without errors, but it does not consume any data from the specified offsets. Checking the job with 'SHOW ROUTINE LOAD' does not reveal the cause: committedTaskNum keeps increasing, but loadedRows stays at 0. The BE log, however, contains repeated 'kafka consume timeout' messages. I created the same job on another cluster running the same version, and it works fine there.
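For reference, the job state described above can be inspected with statements along these lines. This is a minimal sketch, assuming the job is named test_task (the name that appears in the task labels of the BE log); it only reads job status and does not change anything.

```sql
-- Job-level view: check State, Statistic (loadedRows, committedTaskNum),
-- Progress, ReasonOfStateChanged and ErrorLogUrls.
SHOW ROUTINE LOAD FOR test_task;

-- Task-level view: lists the currently running subtasks of the job.
SHOW ROUTINE LOAD TASK WHERE JobName = "test_task";
```

If Progress never advances past the offsets given at creation time while committedTaskNum grows, the tasks are committing empty batches, which matches the 'received rows: 0' lines in the BE log.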

I1204 03:07:23.010603 47523 data_consumer.cpp:132] init kafka consumer with group id: Test2GroupDoris
I1204 03:07:23.478935 48790 routine_load_task_executor.cpp:267] submit a new routine load task: id=f4a2d4dd862245a2-9874e9d03279cef1, job_id=22921599, txn_id=43435323, label=test_task-22921599-f4a2d4dd862245a2-9874e9d03279cef1-43435323, elapse(s)=0, current tasks num: 1
I1204 03:07:23.479050 47524 routine_load_task_executor.cpp:285] begin to execute routine load task: id=f4a2d4dd862245a2-9874e9d03279cef1, job_id=22921599, txn_id=43435323, label=test_task-22921599-f4a2d4dd862245a2-9874e9d03279cef1-43435323, elapse(s)=0
I1204 03:07:23.479334 47524 data_consumer.cpp:132] init kafka consumer with group id: Test2GroupDoris
I1204 03:07:23.480084 47524 data_consumer.cpp:132] init kafka consumer with group id: Test2GroupDoris
I1204 03:07:23.480901 47524 data_consumer.cpp:132] init kafka consumer with group id: Test2GroupDoris
I1204 03:07:23.481585 47524 data_consumer_pool.cpp:102] get consumer group 0d47bbaaa0496c94-1129f84ff8d44c91 with 3 consumers

I1204 03:07:23.481612 47524 data_consumer.cpp:168] consumer: 5241b539e4476586-607ed295c4f1ceb1, grp: 0d47bbaaa0496c94-1129f84ff8d44c91 assign topic partitions: AppPlaySink5, [0: 3669895231] 
I1204 03:07:23.481747 47524 data_consumer.cpp:168] consumer: 7a45f0a4a76ab340-46167cd1b49bdab5, grp: 0d47bbaaa0496c94-1129f84ff8d44c91 assign topic partitions: AppPlaySink5, [1: 3113027716] 
I1204 03:07:23.481930 47524 data_consumer.cpp:168] consumer: 4b4749dffabee51d-c477d899ed6f31b0, grp: 0d47bbaaa0496c94-1129f84ff8d44c91 assign topic partitions: AppPlaySink5, [2: 4683008086] 

I1204 03:07:23.482033 47524 stream_load_executor.cpp:71] begin to execute job. label=test_task-22921599-f4a2d4dd862245a2-9874e9d03279cef1-43435323, txn_id=43435323, query_id=f4a2d4dd862245a2-9874e9d03279cef1
I1204 03:07:23.482151 47524 fragment_mgr.cpp:689] query_id: f4a2d4dd862245a2-9874e9d03279cef1 coord_addr TNetworkAddress(hostname=xxx.xxx.250.170, port=9020) total fragment num on current host: 0
I1204 03:07:23.482177 47524 fragment_mgr.cpp:758] Register query/load memory tracker, query/load id: f4a2d4dd862245a2-9874e9d03279cef1 limit: 2.00 GB
I1204 03:07:23.482211 47524 plan_fragment_executor.cpp:115] PlanFragmentExecutor::prepare|query_id=TUniqueId(hi=-818858134324755038, lo=-7461081602236756239)|instance_id=TUniqueId(hi=-818858134324755038, lo=-7461081602236756238)|backend_num=0|pthread_id=139703869339392
I1204 03:07:23.483139 47524 data_consumer_group.cpp:111] start consumer group: 0d47bbaaa0496c94-1129f84ff8d44c91. max time(ms): 20000, batch rows: 300000, batch size: 209715200. id=f4a2d4dd862245a2-9874e9d03279cef1, job_id=22921599, txn_id=43435323, label=test_task-22921599-f4a2d4dd862245a2-9874e9d03279cef1-43435323, elapse(s)=0
I1204 03:07:23.483232 47484 fragment_mgr.cpp:528] PlanFragmentExecutor::_exec_actual|query_id=f4a2d4dd862245a2-9874e9d03279cef1|instance_id=f4a2d4dd862245a2-9874e9d03279cef2|pthread_id=139704515426048
I1204 03:07:23.483266  2170 data_consumer.cpp:193] start kafka consumer: 5241b539e4476586-607ed295c4f1ceb1, grp: 0d47bbaaa0496c94-1129f84ff8d44c91, max running time(ms): 20000
I1204 03:07:23.483278 47484 plan_fragment_executor.cpp:251] PlanFragmentExecutor::open|query_id=TUniqueId(hi=-818858134324755038, lo=-7461081602236756239)|instance_id=TUniqueId(hi=-818858134324755038, lo=-7461081602236756238)|mem_limit=2.00 GB
I1204 03:07:23.483366  2171 data_consumer.cpp:193] start kafka consumer: 7a45f0a4a76ab340-46167cd1b49bdab5, grp: 0d47bbaaa0496c94-1129f84ff8d44c91, max running time(ms): 20000
I1204 03:07:23.483525  2172 data_consumer.cpp:193] start kafka consumer: 4b4749dffabee51d-c477d899ed6f31b0, grp: 0d47bbaaa0496c94-1129f84ff8d44c91, max running time(ms): 20000
I1204 03:07:23.484398 48364 tablets_channel.cpp:103] open tablets channel: (load_id=f4a2d4dd862245a2-9874e9d03279cef1, index_id=22894415), tablets num: 33, timeout(s): 40
I1204 03:07:24.143123 48127 tablets_channel.cpp:145] close tablets channel: (load_id=73e2b215d9ca4438-b8bbe21c74a6ce2e, index_id=82325), sender id: 0, backend id: 22866632
I1204 03:07:24.143297 48127 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.233.210#loadID=73e2b215d9ca4438-b8bbe21c74a6ce2e; consumption: 0; peak_consumption: 0; , load_id=73e2b215d9ca4438-b8bbe21c74a6ce2e, is high priority=1, sender_ip=xxx.xxx.233.210
I1204 03:07:24.150949 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435314|queue_size=1
I1204 03:07:24.151021 47946 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435314, cost(us): 26, error_tablet_size=0, res=[OK]
I1204 03:07:24.151052 47946 task_worker_pool.cpp:1610] successfully publish version|signature=43435314|transaction_id=43435314|tablets_num=0|cost(s)=0
I1204 03:07:24.205654 48365 tablets_channel.cpp:103] open tablets channel: (load_id=a7c9e64dbdac44d9-bb00415d35693f35, index_id=82325), tablets num: 39, timeout(s): 40
I1204 03:07:24.483422  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:24.483526  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:24.483673  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:25.483566  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:25.483654  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:25.483808  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:26.344619 47980 task_worker_pool.cpp:1068] successfully report TASK|host=xxx.xxx.250.170|port=9020
I1204 03:07:26.483714  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:26.483749  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:26.483943  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:26.527823 47844 compaction.cpp:327] start cumulative compaction. tablet=22826396.1255203342.33418dd0a1e9104a-b094cdf66fbb74b7, output_version=[3-5659], permits: 6
I1204 03:07:26.527940 47844 merger.cpp:341] Start to do vertical compaction, tablet_id: 22826396
I1204 03:07:26.603232 47844 compaction.cpp:514] succeed to do cumulative compaction is_vertical=1. tablet=22826396.1255203342.33418dd0a1e9104a-b094cdf66fbb74b7, output_version=[3-5659], current_max_version=5659, disk=/hdd1/be_storage, segments=6, input_row_num=325, output_row_num=211. elapsed time=0.075498s. cumulative_compaction_policy=size_based, compact_row_per_second=4304

I1204 03:07:27.483865  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:27.483894  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:27.484084  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:27.605733 48147 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.250.170#loadID=b6cba6e61fec424c-b6ae302ac3b4a4a5; consumption: 0; peak_consumption: 0; , load_id=b6cba6e61fec424c-b6ae302ac3b4a4a5, is high priority=1, sender_ip=xxx.xxx.250.170
I1204 03:07:27.615383 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435315|queue_size=1
I1204 03:07:27.638970 48123 load_channel.cpp:46] load channel removed. mem peak usage=4423674, info=label: LoadChannel#senderIp=xxx.xxx.249.170#loadID=4142d1e7326a44b3-8fed976132a8fc79; consumption: 0; peak_consumption: 4423674; , load_id=4142d1e7326a44b3-8fed976132a8fc79, is high priority=1, sender_ip=xxx.xxx.249.170
I1204 03:07:27.645949 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435292|queue_size=1
I1204 03:07:27.672333 47928 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=18085, tablet=22826421.361175689.954844c4274f3b96-f7f7c5377d798aa8, transaction_id=43435315, version=3156, num_rows=3867, res=[OK], cost: 56834(us) 
I1204 03:07:27.672348 47904 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=18085, tablet=22826417.361175689.f14e71164ce37234-f6e6696e63ae6ca6, transaction_id=43435315, version=3156, num_rows=3908, res=[OK], cost: 56852(us) 
I1204 03:07:27.672348 47909 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=18085, tablet=22826413.361175689.374794bb00ef7efd-edb97cdb8846e6a6, transaction_id=43435315, version=3156, num_rows=3821, res=[OK], cost: 56872(us) 
I1204 03:07:27.672446 47945 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435315, cost(us): 57016, error_tablet_size=0, res=[OK]
I1204 03:07:27.672557 47945 task_worker_pool.cpp:1610] successfully publish version|signature=43435315|transaction_id=43435315|tablets_num=3|cost(s)=0
I1204 03:07:27.705737 47914 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=25909, tablet=22827197.1526226747.5a472fb90eeb4b1a-defd2e9389a55196, transaction_id=43435292, version=7934, num_rows=9115, res=[OK], cost: 59681(us) 
I1204 03:07:27.705747 47903 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=25909, tablet=22810777.1526226747.7b46727b5d4e3ee9-a1d5325be31742a2, transaction_id=43435292, version=102709, num_rows=165, res=[OK], cost: 59751(us) 
I1204 03:07:27.705756 47907 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=25909, tablet=22810781.1526226747.79413421591a4b3b-2e8d9873d8bb1c98, transaction_id=43435292, version=102709, num_rows=170, res=[OK], cost: 59754(us) 
I1204 03:07:27.705757 47905 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=25909, tablet=22827193.1526226747.ac4de65166e00939-f9dbe27c93cfb7b7, transaction_id=43435292, version=7934, num_rows=9089, res=[OK], cost: 59706(us) 
I1204 03:07:27.705767 47920 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=25909, tablet=22827189.1526226747.d041c6fa4a2976c9-5adee8602514bba2, transaction_id=43435292, version=7934, num_rows=9132, res=[OK], cost: 59753(us) 
I1204 03:07:27.705852 47944 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435292, cost(us): 59890, error_tablet_size=0, res=[OK]
I1204 03:07:27.705883 47944 task_worker_pool.cpp:1610] successfully publish version|signature=43435292|transaction_id=43435292|tablets_num=5|cost(s)=0
I1204 03:07:28.461325 48140 tablets_channel.cpp:145] close tablets channel: (load_id=6fd9c9f3503d40f1-be7016e1a86528d8, index_id=4494807), sender id: 0, backend id: 22866632
I1204 03:07:28.461553 48140 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.250.170#loadID=6fd9c9f3503d40f1-be7016e1a86528d8; consumption: 0; peak_consumption: 0; , load_id=6fd9c9f3503d40f1-be7016e1a86528d8, is high priority=1, sender_ip=xxx.xxx.250.170
I1204 03:07:28.469347 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435316|queue_size=1
I1204 03:07:28.469411 47948 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435316, cost(us): 18, error_tablet_size=0, res=[OK]
I1204 03:07:28.469441 47948 task_worker_pool.cpp:1610] successfully publish version|signature=43435316|transaction_id=43435316|tablets_num=0|cost(s)=0
I1204 03:07:28.484009  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:28.484032  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:28.484216  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:28.842556 48139 tablets_channel.cpp:145] close tablets channel: (load_id=e6fef543b62646fc-b37d7957b4545d3f, index_id=84287), sender id: 0, backend id: 22866632
I1204 03:07:28.842736 48139 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.249.170#loadID=e6fef543b62646fc-b37d7957b4545d3f; consumption: 0; peak_consumption: 0; , load_id=e6fef543b62646fc-b37d7957b4545d3f, is high priority=1, sender_ip=xxx.xxx.249.170
I1204 03:07:28.851971 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435317|queue_size=1
I1204 03:07:28.852018 47942 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435317, cost(us): 18, error_tablet_size=0, res=[OK]
I1204 03:07:28.852047 47942 task_worker_pool.cpp:1610] successfully publish version|signature=43435317|transaction_id=43435317|tablets_num=0|cost(s)=0
I1204 03:07:28.960469 48366 tablets_channel.cpp:103] open tablets channel: (load_id=0d2ba5366c344617-92f24c1b374b60fb, index_id=84287), tablets num: 46, timeout(s): 40
I1204 03:07:29.117667 47834 storage_engine.cpp:792] remove 64 invalid rowset meta from dir: /hdd1/be_storage
I1204 03:07:29.117836 47834 storage_engine.cpp:792] remove 0 invalid rowset meta from dir: /ssd1/be_storage
I1204 03:07:29.117904 47834 storage_engine.cpp:825] remove 0 invalid binlog meta from dir: /hdd1/be_storage
I1204 03:07:29.117914 47834 storage_engine.cpp:825] remove 0 invalid binlog meta from dir: /ssd1/be_storage
I1204 03:07:29.117933 47834 storage_engine.cpp:850] removed invalid delete bitmap from dir: /hdd1/be_storage, deleted tablets size: 0
I1204 03:07:29.117940 47834 storage_engine.cpp:850] removed invalid delete bitmap from dir: /ssd1/be_storage, deleted tablets size: 0
I1204 03:07:29.117957 47834 storage_engine.cpp:874] removed invalid pending publish info from dir: /hdd1/be_storage, deleted pending publish info size: 0
I1204 03:07:29.117964 47834 storage_engine.cpp:874] removed invalid pending publish info from dir: /ssd1/be_storage, deleted pending publish info size: 0
I1204 03:07:29.126677 47834 data_dir.cpp:820] path: /hdd1/be_storage trash capacity: 6918104429
I1204 03:07:29.126715 47834 data_dir.cpp:820] path: /ssd1/be_storage trash capacity: 0
I1204 03:07:29.484143  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:29.484167  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:29.484344  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:30.228843 48142 tablets_channel.cpp:145] close tablets channel: (load_id=e90baadc45654901-ad7e018d45ae3f40, index_id=4494807), sender id: 0, backend id: 22866632
I1204 03:07:30.228998 48142 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.250.170#loadID=e90baadc45654901-ad7e018d45ae3f40; consumption: 0; peak_consumption: 0; , load_id=e90baadc45654901-ad7e018d45ae3f40, is high priority=1, sender_ip=xxx.xxx.250.170
I1204 03:07:30.235925 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435318|queue_size=1
I1204 03:07:30.235973 47943 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435318, cost(us): 19, error_tablet_size=0, res=[OK]
I1204 03:07:30.236004 47943 task_worker_pool.cpp:1610] successfully publish version|signature=43435318|transaction_id=43435318|tablets_num=0|cost(s)=0
I1204 03:07:30.484272  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:30.484302  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:30.484470  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:30.633963 48367 tablets_channel.cpp:103] open tablets channel: (load_id=d6c65261ef4747c0-b85db1eb4071e976, index_id=4494807), tablets num: 40, timeout(s): 40
I1204 03:07:31.234275 48136 tablets_channel.cpp:145] close tablets channel: (load_id=fba9ccd58b434151-b40d79d6dac04157, index_id=14058), sender id: 0, backend id: 22866632
I1204 03:07:31.234483 48136 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.233.210#loadID=fba9ccd58b434151-b40d79d6dac04157; consumption: 0; peak_consumption: 0; , load_id=fba9ccd58b434151-b40d79d6dac04157, is high priority=1, sender_ip=xxx.xxx.233.210
I1204 03:07:31.243858 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435319|queue_size=1
I1204 03:07:31.243922 47947 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435319, cost(us): 17, error_tablet_size=0, res=[OK]
I1204 03:07:31.243952 47947 task_worker_pool.cpp:1610] successfully publish version|signature=43435319|transaction_id=43435319|tablets_num=0|cost(s)=0
I1204 03:07:31.302433 48368 tablets_channel.cpp:103] open tablets channel: (load_id=7389c8150a6f4ec3-929948d9aa2e9e53, index_id=14058), tablets num: 61, timeout(s): 40
I1204 03:07:31.484402  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:31.484438  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:31.484598  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:32.189793 48145 tablets_channel.cpp:145] close tablets channel: (load_id=ba61f2cbd7354f63-9f5b9eb98ec6e86b, index_id=84233), sender id: 0, backend id: 22866632
I1204 03:07:32.189981 48145 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.250.170#loadID=ba61f2cbd7354f63-9f5b9eb98ec6e86b; consumption: 0; peak_consumption: 0; , load_id=ba61f2cbd7354f63-9f5b9eb98ec6e86b, is high priority=1, sender_ip=xxx.xxx.250.170
I1204 03:07:32.200906 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435320|queue_size=1
I1204 03:07:32.200973 47949 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435320, cost(us): 18, error_tablet_size=0, res=[OK]
I1204 03:07:32.201004 47949 task_worker_pool.cpp:1610] successfully publish version|signature=43435320|transaction_id=43435320|tablets_num=0|cost(s)=0
I1204 03:07:32.259405 48369 tablets_channel.cpp:103] open tablets channel: (load_id=7e6b6cc12f4a4bdc-829bf89b89ad4d67, index_id=84233), tablets num: 50, timeout(s): 40
I1204 03:07:32.484536  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:32.484549  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:32.484727  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:32.649734 47844 compaction.cpp:327] start cumulative compaction. tablet=22826400.1919734962.554571b17fa4e03a-4b72f7bada96e9ad, output_version=[5655-5659], permits: 5
I1204 03:07:32.649813 47844 merger.cpp:341] Start to do vertical compaction, tablet_id: 22826400
I1204 03:07:33.484665  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:33.484692  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:33.484853  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:33.559924 47844 compaction.cpp:514] succeed to do cumulative compaction is_vertical=1. tablet=22826400.1919734962.554571b17fa4e03a-4b72f7bada96e9ad, output_version=[5655-5659], current_max_version=5659, disk=/hdd1/be_storage, segments=5, input_row_num=360645, output_row_num=322693. elapsed time=0.91025s. cumulative_compaction_policy=size_based, compact_row_per_second=396200
I1204 03:07:34.137216 48148 tablets_channel.cpp:145] close tablets channel: (load_id=8ad866e0298c4c54-9a1aa4cf4214ea70, index_id=83257), sender id: 0, backend id: 22866632
I1204 03:07:34.137881 48133 tablets_channel.cpp:145] close tablets channel: (load_id=8ad866e0298c4c54-9a1aa4cf4214ea70, index_id=82600), sender id: 0, backend id: 22866632
I1204 03:07:34.300383 48133 load_channel.cpp:46] load channel removed. mem peak usage=10247120, info=label: LoadChannel#senderIp=xxx.xxx.233.210#loadID=8ad866e0298c4c54-9a1aa4cf4214ea70; consumption: 10247120; peak_consumption: 10247120; , load_id=8ad866e0298c4c54-9a1aa4cf4214ea70, is high priority=1, sender_ip=xxx.xxx.233.210
I1204 03:07:34.409976 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435321|queue_size=1
I1204 03:07:34.442544 47901 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=82599, tablet=22826396.1255203342.33418dd0a1e9104a-b094cdf66fbb74b7, transaction_id=43435321, version=5660, num_rows=16, res=[OK], cost: 32449(us) 
I1204 03:07:34.467495 47923 engine_publish_version_task.cpp:288] publish version successfully on tablet, table_id=82599, tablet=22826400.1919734962.554571b17fa4e03a-4b72f7bada96e9ad, transaction_id=43435321, version=5660, num_rows=21899, res=[OK], cost: 57382(us) 
I1204 03:07:34.467622 47946 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435321, cost(us): 57586, error_tablet_size=0, res=[OK]
I1204 03:07:34.467665 47946 task_worker_pool.cpp:1610] successfully publish version|signature=43435321|transaction_id=43435321|tablets_num=2|cost(s)=0
I1204 03:07:34.484875  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:34.484887  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:34.485023  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:35.485010  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:35.485025  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:35.485149  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:36.485152  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:36.485168  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:36.485278  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:36.839352 47893 olap_server.cpp:1064] cooldown producer get tablet num: 0
I1204 03:07:37.161912 48149 tablets_channel.cpp:145] close tablets channel: (load_id=96d8c5f93fab4cf6-b9262114ee4d789a, index_id=4494807), sender id: 0, backend id: 22866632
I1204 03:07:37.162107 48149 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.249.170#loadID=96d8c5f93fab4cf6-b9262114ee4d789a; consumption: 0; peak_consumption: 0; , load_id=96d8c5f93fab4cf6-b9262114ee4d789a, is high priority=1, sender_ip=xxx.xxx.249.170
I1204 03:07:37.172848 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435322|queue_size=1
I1204 03:07:37.172904 47945 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435322, cost(us): 30, error_tablet_size=0, res=[OK]
I1204 03:07:37.172941 47945 task_worker_pool.cpp:1610] successfully publish version|signature=43435322|transaction_id=43435322|tablets_num=0|cost(s)=0
I1204 03:07:37.239447 48370 tablets_channel.cpp:103] open tablets channel: (load_id=8122b7e4c44f422e-9c009232730e3815, index_id=4494807), tablets num: 40, timeout(s): 40
I1204 03:07:37.485275  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:37.485285  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:37.485420  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:38.485420  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:38.485424  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:38.485558  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:39.485560  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:39.485563  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:39.485687  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:40.345448 47980 task_worker_pool.cpp:1068] successfully report TASK|host=xxx.xxx.250.170|port=9020
I1204 03:07:40.485695  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:40.485713  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:40.485821  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:40.801159 48153 data_consumer.cpp:132] init kafka consumer with group id: AppStartupGroupDoris
I1204 03:07:41.485828  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:41.485850  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:41.485947  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:42.452143 47982 tablet_manager.cpp:1016] find expired transactions for 0 tablets
I1204 03:07:42.459638 47982 tablet_manager.cpp:1048] success to build all report tablets info. tablet_count=2249
I1204 03:07:42.468081 47982 task_worker_pool.cpp:1068] successfully report TABLET|host=xxx.xxx.250.170|port=9020
I1204 03:07:42.485940  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:42.485987  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:42.486076  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:43.486052  2171 data_consumer.cpp:238] kafka consume timeout: 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:43.486078  2171 data_consumer.cpp:261] kafka consumer done: 7a45f0a4a76ab340-46167cd1b49bdab5, grp: 0d47bbaaa0496c94-1129f84ff8d44c91. cancelled: 0, left time(ms): -2, total cost(ms): 20002, consume cost(ms): 20002, received rows: 0, put rows: 0
I1204 03:07:43.486120  2170 data_consumer.cpp:238] kafka consume timeout: 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:43.486138  2170 data_consumer.cpp:261] kafka consumer done: 5241b539e4476586-607ed295c4f1ceb1, grp: 0d47bbaaa0496c94-1129f84ff8d44c91. cancelled: 0, left time(ms): -2, total cost(ms): 20002, consume cost(ms): 20002, received rows: 0, put rows: 0
I1204 03:07:43.486204  2172 data_consumer.cpp:238] kafka consume timeout: 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:43.486223  2172 data_consumer.cpp:261] kafka consumer done: 4b4749dffabee51d-c477d899ed6f31b0, grp: 0d47bbaaa0496c94-1129f84ff8d44c91. cancelled: 0, left time(ms): -2, total cost(ms): 20002, consume cost(ms): 20002, received rows: 0, put rows: 0
I1204 03:07:43.486239  2172 data_consumer_group.cpp:87] all consumers are finished. shutdown queue. group id: 0d47bbaaa0496c94-1129f84ff8d44c91
I1204 03:07:43.486294 47524 data_consumer_group.cpp:131] consumer group done: 0d47bbaaa0496c94-1129f84ff8d44c91. consume time(ms)=20003, received rows=0, received bytes=0, eos: 1, left_time: -3, left_rows: 300000, left_bytes: 209715200, blocking get time(us): 20003120, blocking put time(us): 0, id=f4a2d4dd862245a2-9874e9d03279cef1, job_id=22921599, txn_id=43435323, label=test_task-22921599-f4a2d4dd862245a2-9874e9d03279cef1-43435323, elapse(s)=20
I1204 03:07:43.486337 47524 data_consumer.cpp:404] kafka consumer cancelled. 5241b539e4476586-607ed295c4f1ceb1
I1204 03:07:43.486347 47524 data_consumer.cpp:404] kafka consumer cancelled. 7a45f0a4a76ab340-46167cd1b49bdab5
I1204 03:07:43.486353 47524 data_consumer.cpp:404] kafka consumer cancelled. 4b4749dffabee51d-c477d899ed6f31b0
I1204 03:07:43.487011 47484 vtablet_sink.cpp:931] VNodeChannel[22894415-22876147], load_id=f4a2d4dd862245a2-9874e9d03279cef1, txn_id=43435323, node=xxx.xxx.233.210:8060 mark closed, left pending batch size: 1
I1204 03:07:43.487036 47484 vtablet_sink.cpp:931] VNodeChannel[22894415-22866631], load_id=f4a2d4dd862245a2-9874e9d03279cef1, txn_id=43435323, node=xxx.xxx.250.170:8060 mark closed, left pending batch size: 1
I1204 03:07:43.487064 47484 vtablet_sink.cpp:931] VNodeChannel[22894415-22866630], load_id=f4a2d4dd862245a2-9874e9d03279cef1, txn_id=43435323, node=xxx.xxx.249.170:8060 mark closed, left pending batch size: 1
I1204 03:07:43.487072 47484 vtablet_sink.cpp:931] VNodeChannel[22894415-22866632], load_id=f4a2d4dd862245a2-9874e9d03279cef1, txn_id=43435323, node=xxx.xxx.245.98:8060 mark closed, left pending batch size: 1
I1204 03:07:43.488685 48163 tablets_channel.cpp:145] close tablets channel: (load_id=f4a2d4dd862245a2-9874e9d03279cef1, index_id=22894415), sender id: 0, backend id: 22866632
I1204 03:07:43.488823 48163 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=xxx.xxx.245.98#loadID=f4a2d4dd862245a2-9874e9d03279cef1; consumption: 0; peak_consumption: 0; , load_id=f4a2d4dd862245a2-9874e9d03279cef1, is high priority=1, sender_ip=xxx.xxx.245.98
I1204 03:07:43.489120 48513 vtablet_sink.cpp:1158] all node channels are stopped(maybe finished/offending/cancelled), sender thread exit. f4a2d4dd862245a2-9874e9d03279cef1
I1204 03:07:43.490420 47484 vtablet_sink.cpp:1585] total mem_exceeded_block_ns=0, total queue_push_lock_ns=0, total actual_consume_ns=289036, load id=f4a2d4dd862245a2-9874e9d03279cef1
I1204 03:07:43.490453 47484 vtablet_sink.cpp:1629] finished to close olap table sink. load_id=f4a2d4dd862245a2-9874e9d03279cef1, txn_id=43435323, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {22866632:(0)(0)(3)(1)} {22866630:(0)(0)(3)(1)} {22866631:(0)(0)(3)(1)} {22876147:(0)(0)(3)(1)} 
I1204 03:07:43.490675 47484 exec_node.cpp:200] fragment_instance_id=f4a2d4dd862245a2-9874e9d03279cef2 closed
I1204 03:07:43.490844 47484 plan_fragment_executor.cpp:563] Close() fragment_instance_id=f4a2d4dd862245a2-9874e9d03279cef2
I1204 03:07:43.491314 47484 query_context.h:69] Deregister query/load memory tracker, queryId=f4a2d4dd862245a2-9874e9d03279cef1, Limit=2.00 GB, CurrUsed=-1.85 KB, PeakUsed=25.55 MB
I1204 03:07:43.498273 48790 task_worker_pool.cpp:263] successfully submit task|type=PUBLISH_VERSION|signature=43435323|queue_size=1
I1204 03:07:43.498340 47944 engine_publish_version_task.cpp:232] finish to publish version on transaction.transaction_id=43435323, cost(us): 19, error_tablet_size=0, res=[OK]
I1204 03:07:43.498373 47944 task_worker_pool.cpp:1610] successfully publish version|signature=43435323|transaction_id=43435323|tablets_num=0|cost(s)=0

What You Expected?

How can I resolve this issue?

How to Reproduce?

No response

Anything Else?

No response

vinlee19 commented 12 months ago

Can you provide the routine load statement? Is property.group.id the same as on the other cluster?

AlexLWei commented 12 months ago

> Can you provide the routine load statement? Is property.group.id the same as on the other cluster?

Yes, they are exactly the same. I ran 'SHOW CREATE ROUTINE LOAD' and copied the statement to the other cluster.

CREATE ROUTINE LOAD test_task ON test
WITH APPEND
COLUMNS(userId,date,terminalType,bussType,appId,title,category,mCode,`from`,duration,playCount)
PROPERTIES
(
    "desired_concurrent_number" = "1",
    "max_error_number" = "30000",
    "max_batch_interval" = "20",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "format" = "json",
    "strip_outer_array" = "false",
    "num_as_string" = "false",
    "fuzzy_parse" = "false",
    "strict_mode" = "false",
    "timezone" = "+00:00",
    "exec_mem_limit" = "2147483648"
)
FROM KAFKA
(
    "kafka_broker_list" = "xxxxxxxx.com:8291",
    "kafka_topic" = "AppPlaySink5",
    "property.group.id" = "Test2GroupDoris",
    "property.client.id" = "AppPlayClientDoris",
    "kafka_partitions" = "0,1,2",
    "kafka_offsets" = "3669895231, 3113027716, 4683008086"
);

vinlee19 commented 12 months ago

You can try changing property.group.id. This is my WeChat: Aurora_0_618
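The suggestion to change property.group.id could be tried roughly as follows. This is only a sketch: the job name test_task comes from the statement earlier in the thread, the new group name is a hypothetical placeholder, and nothing in this thread confirms that the change resolves the timeout.

```sql
-- ALTER ROUTINE LOAD only works on a job in the PAUSED state.
PAUSE ROUTINE LOAD FOR test_task;

-- "Test2GroupDorisNew" is a hypothetical new consumer group name;
-- pick any group id that is not already in use on the Kafka cluster.
ALTER ROUTINE LOAD FOR test_task
FROM KAFKA
(
    "property.group.id" = "Test2GroupDorisNew"
);

-- Resume the job so new tasks are scheduled with the changed property.
RESUME ROUTINE LOAD FOR test_task;
```

After resuming, 'SHOW ROUTINE LOAD FOR test_task' should show whether loadedRows starts to increase.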

kevinu2 commented 8 months ago

How can this be fixed?