Open Ruees opened 1 year ago
Some abnormal messages were found in be.INFO:
1101 03:12:21.129379 2076520 task_worker_pool.cpp:725] failed to publish version|signature=5787540|transaction_id=5787540|error_tablets_num=100|error=[E-3115]
I1101 03:12:21.129380 2076527 task_worker_pool.cpp:693] task elapsed 11 seconds since it is inserted to queue, it is timeout
W1101 03:12:21.129392 2076527 task_worker_pool.cpp:725] failed to publish version|signature=5787551|transaction_id=5787551|error_tablets_num=100|error=[E-3115]
I1101 03:12:21.129451 2076521 task_worker_pool.cpp:693] task elapsed 11 seconds since it is inserted to queue, it is timeout
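To see how widespread these failures are, the "failed to publish version" lines can be tallied by error code and transaction ID. The following is a minimal Python sketch, assuming the pipe-delimited `key=value` layout seen in the excerpt above; the helper names `parse_publish_failure` and `summarize` are hypothetical and not part of Doris.

```python
import re
from collections import Counter

# Matches the pipe-delimited key=value fields seen in the log excerpt
# (assumed layout; other BE versions may format these lines differently).
FIELD_RE = re.compile(r"(\w+)=([^|\s]+)")
MARKER = "failed to publish version|"


def parse_publish_failure(line):
    """Return the key=value fields of a publish-failure line, or None."""
    pos = line.find(MARKER)
    if pos == -1:
        return None
    payload = line[pos + len(MARKER):]
    return dict(FIELD_RE.findall(payload))


def summarize(lines):
    """Count failures per error code and collect the affected transaction IDs."""
    by_error = Counter()
    txn_ids = []
    for line in lines:
        fields = parse_publish_failure(line)
        if fields is None:
            continue  # not a publish-failure line (e.g. the timeout notices)
        by_error[fields.get("error", "unknown")] += 1
        txn_ids.append(fields.get("transaction_id"))
    return by_error, txn_ids


if __name__ == "__main__":
    sample = [
        "W1101 03:12:21.129392 2076527 task_worker_pool.cpp:725] failed to publish "
        "version|signature=5787551|transaction_id=5787551|error_tablets_num=100|error=[E-3115]",
    ]
    print(summarize(sample))
```

Feeding the whole log file through `summarize` (one line per entry) would show whether the failures are concentrated in a few transactions or spread across the entire load.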
Is the data being loaded into a MoW (merge-on-write) table?
The thread dump you provided doesn't contain detailed information about the specific Flink CDC task or your application's code. However, I can offer some general guidance on how to approach the issue of high CPU usage on one of the BE nodes running Flink CDC tasks:
Analyze the High CPU Thread:
Use jstack, jvisualvm, or other profiling tools to capture thread dumps and gain insights into what the high CPU thread is doing. This will help you pinpoint the exact issue.
Possible Causes of High CPU Usage:
Check Flink Configuration:
MySQL and Doris Synchronization:
Monitoring:
Scale Out:
Optimization:
Fine-Tuning:
Updates and Patches:
Consult Documentation and Community:
Without more detailed information, it's challenging to pinpoint the exact cause of the high CPU usage. You may need to investigate the application further and monitor its behavior to identify and resolve the issue. Additionally, consider involving your development and operations teams to collaborate on debugging and optimizing the system.
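As a concrete sketch of the thread-analysis step: on Linux, the `nid` field in a jstack thread header is the native thread ID in hexadecimal, while `top -H -p <pid>` reports thread IDs in decimal. The Python below matches one against the other. The function names are hypothetical, and the regular expression assumes the HotSpot header format shown in the dump in this issue.

```python
import re

# A jstack thread header looks like: "name" ... nid=0x1fac79 ...
HEADER_RE = re.compile(r'^"(?P<name>[^"]+)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)')


def native_thread_ids(dump_text):
    """Map decimal native thread ID -> JVM thread name for each header in the dump."""
    threads = {}
    for line in dump_text.splitlines():
        match = HEADER_RE.match(line.strip())
        if match:
            # int() with base 16 accepts the "0x" prefix.
            threads[int(match.group("nid"), 16)] = match.group("name")
    return threads


def find_jvm_thread(dump_text, os_tid):
    """Return the JVM thread name for a decimal thread ID from `top -H`, or None."""
    return native_thread_ids(dump_text).get(os_tid)


if __name__ == "__main__":
    dump = (
        '"main" #1 prio=5 os_prio=0 tid=0x00007f00eae8e000 nid=0x1fac79 runnable\n'
        '"VM Thread" os_prio=0 tid=0x00007f00ed8e2800 nid=0x1fac84 runnable\n'
    )
    # Replace 2076527 with the hot thread ID reported by `top -H`.
    print(find_jvm_thread(dump, 2076527))
```

If the hot thread ID is not found in the dump, the thread is probably not a JVM thread at all. In that case jstack cannot show its stack, and a native tool such as `pstack` or `perf` would be the next thing to try.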
Version
1.2.6
What's Wrong?
There are three BE nodes in a resource group, and Flink CDC tasks have been synchronizing a MySQL database to Doris. The CPU usage of two BE nodes is around 20%, but one BE node has a CPU usage of up to 90%. It should not be a compaction, because this state has persisted for a day. After taking a stack dump on this BE node, I obtained the following information:
2023-11-01 09:52:55
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.351-b10 mixed mode):
"Attach Listener" #12 daemon prio=9 os_prio=0 tid=0x00007effc2c79800 nid=0x1fb212 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Service Thread" #8 daemon prio=9 os_prio=0 tid=0x00007f00993a8800 nid=0x1fac8b runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x00007f0099129000 nid=0x1fac8a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f0099128000 nid=0x1fac89 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f00efe1f000 nid=0x1fac88 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f00993a8000 nid=0x1fac87 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f00993a7000 nid=0x1fac86 in Object.wait() [0x00007f0095875000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f00eedd8800 nid=0x1fac85 in Object.wait() [0x00007f0095976000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
"main" #1 prio=5 os_prio=0 tid=0x00007f00eae8e000 nid=0x1fac79 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 tid=0x00007f00ed8e2800 nid=0x1fac84 runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f00ed8df000 nid=0x1fac7e runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f00ed8e0000 nid=0x1fac7f runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f00ed8e0800 nid=0x1fac80 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f00ed8e1000 nid=0x1fac81 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f00ed8e3000 nid=0x1fac8c waiting on condition
JNI global references: 351
What You Expected?
Identify the cause of the high CPU usage on the BE node and how to resolve it.
How to Reproduce?
No response
Anything Else?
No response