apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0
12.52k stars 3.24k forks source link

[Bug] How to resolve TOO_MANY_TASKS error of ROUTINE LOAD in doris2.0.1? #24889

Open dudufan opened 1 year ago

dudufan commented 1 year ago

Search before asking

Version

doris 2.0.1 no_avx2

What's Wrong?

After creating ROUTINE LOAD, it occurs TOO_MANY_TASKS error. I tought this error maybe resolved by others in 4713 6342 , but I don't know how to resolve it.

CREATE ROUTINE LOAD demo.load_test
 PROPERTIES(
 "desired_concurrent_number"="1",
 "max_batch_interval"="20",
 "max_batch_rows"="300000",
 "max_batch_size"="209715200",
 "strict_mode" = "false",
 "format" = "json"
 )
 FROM KAFKA(
 "kafka_broker_list"="10.0.143.4:9092",
 "kafka_topic"="doris1",
 "property.group.id"="df-consumer",
 "property.client.id"="1234511",
 "property.kafka_default_offsets"="OFFSET_BEGINNING");

then show the load task, msg: 2023-09-26 08:48:15:errCode = 2, detailMessage = failed to send task: errCode = 2, detailMessage = failed to submit task. error code: TOO_MANY_TASKS, msg: (10.0.209.74)[TOO_MANY_TASKS]839b8a214383416e-9302587c0fc837b7_10.0.209.74

related log below:

fe.log

2023-09-25 16:37:38,623 INFO (thrift-server-pool-8|372) [ReportHandler.handleReport():196] receive report from be 14262. type: TASK, current queue size: 1
2023-09-25 16:37:38,623 INFO (Thread-49|99) [ReportHandler.taskReport():549] finished to handle task report from backend 14262, diff task num: 0. cost: 0 ms
2023-09-25 16:37:39,897 INFO (colocate group clone checker|89) [ColocateTableCheckerAndBalancer.matchGroup():309] finished to check tablets. unhealth/total/added/in_sched/not_ready: 0/0/0/0/0, cost: 0 ms
2023-09-25 16:37:41,333 INFO (txnCleaner|83) [DatabaseTransactionMgr.abortTransaction():1325] abort transaction: TransactionState. transaction id: 4446, label: a239dddb3fa04a86-9d42ef8956a200ac, db id: 14266, table id list: 0, callback id: 33793, coordinator: FE: 10.0.209.74, transaction status: ABORTED, error replicas num: 0, replica ids: , prepare time: 1695630996560, commit time: -1, finish time: 1695631061332, reason: timeout by txn manager successfully
2023-09-25 16:37:41,333 INFO (txnCleaner|83) [DatabaseTransactionMgr.removeExpiredAndTimeoutTxns():1839] transaction [4446] is timeout, abort it by transaction manager
2023-09-25 16:37:41,334 INFO (txnCleaner|83) [DatabaseTransactionMgr.abortTransaction():1325] abort transaction: TransactionState. transaction id: 4447, label: 9094910533c546e1-81d0d06a15cc6c4a, db id: 14266, table id list: 0, callback id: 33790, coordinator: FE: 10.0.209.74, transaction status: ABORTED, error replicas num: 0, replica ids: , prepare time: 1695630996562, commit time: -1, finish time: 1695631061333, reason: timeout by txn manager successfully
2023-09-25 16:37:41,334 INFO (txnCleaner|83) [DatabaseTransactionMgr.removeExpiredAndTimeoutTxns():1839] transaction [4447] is timeout, abort it by transaction manager

be.INFO

I0925 16:46:36.934396  1010 routine_load_task_executor.cpp:182] too many tasks in thread pool. reject task: 6617fd924e7b484b-9a5084014949055c, job id: 33790, queue size: 0, current tasks num: 10
W0925 16:46:36.934440  1010 backend_service.cpp:253] failed to submit routine load task. job id: 33790 task id: TUniqueId(hi=7356627321156749387, lo=-7327211452672440996)
I0925 16:46:41.368535  1010 task_worker_pool.cpp:267] successfully submit task|type=CLEAR_TRANSACTION_TASK|signature=4477|queue_size=1
I0925 16:46:41.368573  1010 task_worker_pool.cpp:267] successfully submit task|type=CLEAR_TRANSACTION_TASK|signature=4478|queue_size=2
I0925 16:46:41.368590  1010 task_worker_pool.cpp:267] successfully submit task|type=CLEAR_TRANSACTION_TASK|signature=4479|queue_size=3

be.waring.log...

W0925 17:38:09.983999 21052 local_file_system.cpp:299] [IO_ERROR]failed to list /proc/21007/fd/: (2), 没有那个文件或目录
W0925 17:38:09.984050 21052 doris_metrics.cpp:379] failed to count fd: [IO_ERROR]failed to list /proc/21007/fd/: (2), 没有那个文件或目录
W0925 17:38:39.179450 17220 backend_service.cpp:253] failed to submit routine load task. job id: 33790 task id: TUniqueId(hi=5910686550301887220, lo=-7511476450436058681)

What You Expected?

ROUTINE LOAD execute normally.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

Code of Conduct

PliskinZhang commented 1 year ago

i found same question in 2.0.0. it's look like only create task but not release. how to fix it? only one running routine load.

be.info.log I0926 10:36:31.550918 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0, current tasks num: 1 I0926 10:36:31.550987 109916 routine_load_task_executor.cpp:285] begin to execute routine load task: id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0 I0926 10:36:31.551146 109916 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0 I0926 10:36:46.349925 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0, current tasks num: 2 I0926 10:36:46.349989 109917 routine_load_task_executor.cpp:285] begin to execute routine load task: id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0 I0926 10:36:46.350137 109917 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0 I0926 10:36:56.353701 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0, current tasks num: 3 I0926 10:36:56.353776 109918 routine_load_task_executor.cpp:285] begin to execute routine load task: id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0 I0926 10:36:56.354025 109918 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0 I0926 10:37:06.357452 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0, current tasks num: 4 I0926 10:37:06.357524 109919 routine_load_task_executor.cpp:285] begin to execute routine load task: id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0 I0926 10:37:06.357717 109919 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0 I0926 10:37:16.361900 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0, current tasks num: 5 I0926 10:37:16.361989 109920 routine_load_task_executor.cpp:285] begin to execute routine load task: id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0 I0926 10:37:16.362210 109920 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0 I0926 10:37:26.365837 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=c11f6ee0ce7b4485-bb1a3b4b23ef7fa2, job_id=44616, txn_id=77140, label=perf-44616-c11f6ee0ce7b4485-bb1a3b4b23ef7fa2-77140, elapse(s)=0, current tasks num: 6

dudufan commented 1 year ago

i found same question in 2.0.0. it's look like only create task but not release. how to fix it? only one running routine load.

be.info.log I0926 10:36:31.550918 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0, current tasks num: 1 I0926 10:36:31.550987 109916 routine_load_task_executor.cpp:285] begin to execute routine load task: id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0 I0926 10:36:31.551146 109916 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=91bcba040a1f44f4-bee8e85ef6bfd278, job_id=44616, txn_id=77135, label=perf-44616-91bcba040a1f44f4-bee8e85ef6bfd278-77135, elapse(s)=0 I0926 10:36:46.349925 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0, current tasks num: 2 I0926 10:36:46.349989 109917 routine_load_task_executor.cpp:285] begin to execute routine load task: id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0 I0926 10:36:46.350137 109917 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=10c422ec73ff4e9a-922ac2b418f447b6, job_id=44616, txn_id=77136, label=perf-44616-10c422ec73ff4e9a-922ac2b418f447b6-77136, elapse(s)=0 I0926 10:36:56.353701 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0, current tasks num: 3 I0926 10:36:56.353776 109918 routine_load_task_executor.cpp:285] begin to execute routine load task: id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0 I0926 10:36:56.354025 109918 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=70e3cd758f114301-92ebc5916cc117e2, job_id=44616, txn_id=77137, label=perf-44616-70e3cd758f114301-92ebc5916cc117e2-77137, elapse(s)=0 I0926 10:37:06.357452 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0, current tasks num: 4 I0926 10:37:06.357524 109919 routine_load_task_executor.cpp:285] begin to execute routine load task: id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0 I0926 10:37:06.357717 109919 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=61020b11263e44f7-a977a4130fc4f0bd, job_id=44616, txn_id=77138, label=perf-44616-61020b11263e44f7-a977a4130fc4f0bd-77138, elapse(s)=0 I0926 10:37:16.361900 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0, current tasks num: 5 I0926 10:37:16.361989 109920 routine_load_task_executor.cpp:285] begin to execute routine load task: id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0 I0926 10:37:16.362210 109920 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=b25cc07bb21d4717-a1d403f71596d8e1, job_id=44616, txn_id=77139, label=perf-44616-b25cc07bb21d4717-a1d403f71596d8e1-77139, elapse(s)=0 I0926 10:37:26.365837 111145 routine_load_task_executor.cpp:267] submit a new routine load task: id=c11f6ee0ce7b4485-bb1a3b4b23ef7fa2, job_id=44616, txn_id=77140, label=perf-44616-c11f6ee0ce7b4485-bb1a3b4b23ef7fa2-77140, elapse(s)=0, current tasks num: 6

I don't know why, but it worked when increasing this two parameter: (fe)max_routine_load_task_num_per_be , (be)routine_load_thread_pool_size

PliskinZhang commented 1 year ago

i deploy the test env is 1fe+1be. i found when i increasing max_batch_interval=60(max),it's looks like be nomal. but consume very slow , about 400 row per times. but i don't know why I0926 11:37:36.238286 172134 data_consumer.cpp:234] kafka consume timeout: 63480f09a42d6165-591140e9b9db3fae I0926 11:37:37.984771 172134 data_consumer.cpp:234] kafka consume timeout: 63480f09a42d6165-591140e9b9db3fae I0926 11:37:39.251384 172134 data_consumer.cpp:234] kafka consume timeout: 63480f09a42d6165-591140e9b9db3fae I0926 11:37:40.251472 172134 data_consumer.cpp:234] kafka consume timeout: 63480f09a42d6165-591140e9b9db3fae I0926 11:37:40.251492 172134 data_consumer.cpp:257] kafka consumer done: 63480f09a42d6165-591140e9b9db3fae, grp: a648ee7bb83177ec-d82f3befe86d3cb3. cancelled: 0, left time(ms): -776, total cost(ms): 60776, consume cost(ms): 60774, received rows: 390, put rows: 390 I0926 11:37:40.251507 172134 data_consumer_group.cpp:87] all consumers are finished. shutdown queue. group id: a648ee7bb83177ec-d82f3befe86d3cb3 I0926 11:37:40.251523 160128 data_consumer_group.cpp:131] consumer group done: a648ee7bb83177ec-d82f3befe86d3cb3. consume time(ms)=60776, received rows=390, received bytes=42380, eos: 1, left_time: -776, left_rows: 299610, left_bytes: 209672820, blocking get time(us): 60749899, blocking put time(us): 156, id=f46cc7a6026a448f-b398133ce6e0eed1, job_id=44616, txn_id=77200, label=perf-44616-f46cc7a6026a448f-b398133ce6e0eed1-77200, elapse(s)=60

CalvinKirs commented 1 year ago

Hi, I would like to confirm if you are using routine load multi-table import, if you have a small number of jobs this may be another issue. If you have a large number the routine load jobs, you can:

  1. Reduce concurrency by lowering max_routine_load_task_concurrent_num (fe.config)
  2. Increase slots by increasing max_routine_load_task_num_per_be & routine_load_thread_pool_size. (be.config)

And it should be noted that max_routine_load_task_num_per_be must be less than routine_load_thread_pool_size.

PliskinZhang commented 1 year ago

Hi, I would like to confirm if you are using routine load multi-table import, if you have a small number of jobs this may be another issue. If you have a large number the routine load jobs, you can:

  1. Reduce concurrency by lowering max_routine_load_task_concurrent_num (fe.config)
  2. Increase slots by increasing max_routine_load_task_num_per_be & routine_load_thread_pool_size. (be.config)

And it should be noted that max_routine_load_task_num_per_be must be less than routine_load_thread_pool_size.

yes,im using routine load multi-table import i set max_routine_load_task_concurrent_num = 1 ,but in be.log the task num still increase to routine_load_thread_pool_size,why? 1fe 1be,should i restart be?

PliskinZhang commented 1 year ago

Hi, I would like to confirm if you are using routine load multi-table import, if you have a small number of jobs this may be another issue. If you have a large number the routine load jobs, you can:

  1. Reduce concurrency by lowering max_routine_load_task_concurrent_num (fe.config)
  2. Increase slots by increasing max_routine_load_task_num_per_be & routine_load_thread_pool_size. (be.config)

And it should be noted that max_routine_load_task_num_per_be must be less than routine_load_thread_pool_size.

yes,im using routine load multi-table import i set max_routine_load_task_concurrent_num = 1 ,but in be.log the task num still increase to routine_load_thread_pool_size,why? 1fe 1be,should i restart be?

put this config to fe.conf,not work.when i resume the multi-table routine load, curent tasks num can't remain one job task per be

dudufan commented 1 year ago

Hi, I would like to confirm if you are using routine load multi-table import, if you have a small number of jobs this may be another issue. If you have a large number the routine load jobs, you can:

  1. Reduce concurrency by lowering max_routine_load_task_concurrent_num (fe.config)
  2. Increase slots by increasing max_routine_load_task_num_per_be & routine_load_thread_pool_size. (be.config)

And it should be noted that max_routine_load_task_num_per_be must be less than routine_load_thread_pool_size.

Yes, I also use multi-table import, and under this occasion Increase slots can work for a while. After setting be multi_table_batch_plan_threshold=5 in be.conf,it is ok. But I still dont know why...

CalvinKirs commented 1 year ago

If you use a multi-table routineload and the amount of data is particularly small, you can adjust this parameter. multi_table_batch_plan_threshold, usually if there are very few tables, you can adjust it to 5 or less.

PliskinZhang commented 1 year ago

If you use a multi-table routineload and the amount of data is particularly small, you can adjust this parameter. multi_table_batch_plan_threshold, usually if there are very few tables, you can adjust it to 5 or less.

if i have many tables,more than 200,how to do?may be use another way to put data? use all config above,not useful,routine load task stiil increase slowly althought i stop the load,but pool not release.

PliskinZhang commented 1 year ago

i found that normal routine load is ok. but muti-table routine load has problem.

noramal log: I0927 14:12:39.462114 351714 routine_load_task_executor.cpp:267] submit a new routine load task: id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0, current tasks num: 2 I0927 14:12:39.462155 159997 routine_load_task_executor.cpp:285] begin to execute routine load task: id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0 I0927 14:12:39.462463 159997 stream_load_executor.cpp:71] begin to execute job. label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, txn_id=82139, query_id=5f31ac24344e4c47-abfd3f3efe9b2c58 I0927 14:12:39.462525 159997 fragment_mgr.cpp:689] query_id: 5f31ac24344e4c47-abfd3f3efe9b2c58 coord_addr TNetworkAddress(hostname=192.168.1.37, port=9020) total fragment num on current host: 0 I0927 14:12:39.462541 159997 fragment_mgr.cpp:758] Register query/load memory tracker, query/load id: 5f31ac24344e4c47-abfd3f3efe9b2c58 limit: 2.00 GB I0927 14:12:39.463045 159997 data_consumer_group.cpp:111] start consumer group: 0e4573b9099d6821-ea027b26758a5bbd. max time(ms): 60000, batch rows: 200000, batch size: 209715200. id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=0 I0927 14:12:39.463085 159961 fragment_mgr.cpp:528] PlanFragmentExecutor::_exec_actual|query_id=5f31ac24344e4c47-abfd3f3efe9b2c58|instance_id=5f31ac24344e4c47-abfd3f3efe9b2c59|pthread_id=140232152180480 I0927 14:12:39.463495 160763 tablets_channel.cpp:103] open tablets channel: (load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, index_id=160425), tablets num: 128, timeout(s): 120 I0927 14:13:40.305856 159997 data_consumer_group.cpp:131] consumer group done: 0e4573b9099d6821-ea027b26758a5bbd. consume time(ms)=60842, received rows=30, received bytes=3211, eos: 1, left_time: -842, left_rows: 199970, left_bytes: 209711989, blocking get time(us): 60842733, blocking put time(us): 2, id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=60 I0927 14:13:40.306563 159961 vtablet_sink.cpp:894] VNodeChannel[160425-130002], load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, txn_id=82139, node=192.168.1.37:8060 mark closed, left pending batch size: 1 I0927 14:13:40.307304 160596 tablets_channel.cpp:145] close tablets channel: (load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, index_id=160425), sender id: 0, backend id: 130002 I0927 14:13:40.307919 161034 vtablet_sink.cpp:1121] all node channels are stopped(maybe finished/offending/cancelled), sender thread exit. 5f31ac24344e4c47-abfd3f3efe9b2c58 I0927 14:13:40.324090 160596 load_channel.cpp:46] load channel removed. mem peak usage=0, info=label: LoadChannel#senderIp=192.168.1.37#loadID=5f31ac24344e4c47-abfd3f3efe9b2c58; consumption: 0; peak_consumption: 0; , load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, is high priority=1, sender_ip=192.168.1.37 I0927 14:13:40.325825 159961 vtablet_sink.cpp:1528] total mem_exceeded_block_ns=0, total queue_push_lock_ns=0, total actual_consume_ns=115047, load id=5f31ac24344e4c47-abfd3f3efe9b2c58 I0927 14:13:40.325846 159961 vtablet_sink.cpp:1572] finished to close olap table sink. load_id=5f31ac24344e4c47-abfd3f3efe9b2c58, txn_id=82139, node add batch time(ms)/wait execution time(ms)/close time(ms)/num: {130002:(17)(0)(19)(1)} I0927 14:13:40.326172 159961 query_context.h:69] Deregister query/load memory tracker, queryId=5f31ac24344e4c47-abfd3f3efe9b2c58, Limit=2.00 GB, CurrUsed=-24.48 KB, PeakUsed=25.31 MB I0927 14:13:40.365257 159997 routine_load_task_executor.cpp:257] finished routine load task id=5f31ac24344e4c47-abfd3f3efe9b2c58, job_id=160686, txn_id=82139, label=health-160686-5f31ac24344e4c47-abfd3f3efe9b2c58-82139, elapse(s)=60, status: [OK], current tasks num: 2

muti-table log: I0927 14:21:22.827615 351714 routine_load_task_executor.cpp:267] submit a new routine load task: id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0, current tasks num: 4 I0927 14:21:22.827625 159997 routine_load_task_executor.cpp:285] begin to execute routine load task: id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0 I0927 14:21:22.828194 159997 routine_load_task_executor.cpp:296] recv single-stream-multi-table request, ctx=id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0 I0927 14:21:22.828397 159997 data_consumer_group.cpp:111] start consumer group: c74840f8c925b3cf-de3a780dee8213a6. max time(ms): 30000, batch rows: 200000, batch size: 209715200. id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=0 I0927 14:21:52.831889 159997 data_consumer_group.cpp:131] consumer group done: c74840f8c925b3cf-de3a780dee8213a6. consume time(ms)=30003, received rows=0, received bytes=0, eos: 1, left_time: -3, left_rows: 200000, left_bytes: 209715200, blocking get time(us): 30003469, blocking put time(us): 0, id=71ae9956abab417a-a9d80e07f0150f21, job_id=161003, txn_id=82149, label=perf-161003-71ae9956abab417a-a9d80e07f0150f21-82149, elapse(s)=30

not yet finish last routine load,new task has submit.so pool num be more and more. why? i wait for a long time no finish log appear