lifepuzzlefun commented 1 year ago

Motivation

As we know, Pulsar is written in an asynchronous style. In previous versions, there were cases that caused threads to block for a long time. This PIP introduces a lightweight APM mechanism to check whether a thread has been blocked for a long time.

Goal

Monitor most Pulsar threads.
Low overhead.
Support dynamic enable/disable.

API Changes

No API changes.

Implementation

Basic idea

For a single-thread executor, there is one task queue for the thread. If a task is inserted into the queue and gets executed, we can conclude that the thread is not blocked. Otherwise, the task stays in the queue for a very long time.

So we can register the executor pool with a monitor class and submit a ping task periodically. When the ping task executes, it updates the thread's lastActiveTimestamp. The monitor then checks for threads whose lastActiveTimestamp has not been refreshed in time; such a thread is probably blocked.
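The ping idea can be sketched roughly as follows. This is a minimal illustration, not the PIP's implementation: `PingSketch`, `ping` and `isStale` are hypothetical names, and only `lastActiveTimestamp` comes from the proposal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical sketch; class and method names are assumptions for illustration.
class PingSketch {
    // thread id -> last time a ping task ran on that thread
    static final Map<Long, Long> lastActiveTimestamp = new ConcurrentHashMap<>();

    // Submit a ping task; it only runs once the executor's thread is free to pick it up.
    static void ping(Executor executor) {
        executor.execute(() -> lastActiveTimestamp.put(
                Thread.currentThread().getId(), System.currentTimeMillis()));
    }

    // A known thread whose last ping is at least thresholdMs old is suspected to be blocked.
    static boolean isStale(long threadId, long nowMs, long thresholdMs) {
        Long last = lastActiveTimestamp.get(threadId);
        return last != null && nowMs - last >= thresholdMs;
    }
}
```

A monitor would call `ping` on every registered executor at a fixed interval and run `isStale` from a separate thread, so that the check itself cannot be delayed by the thread under observation.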

For a multi-thread executor, such as Executors.newScheduledThreadPool(100), the task queue is shared among all the threads in the pool. The ping task is therefore not guaranteed to run on a different thread each time, so the lastActiveTimestamp of an idle, healthy thread may not be refreshed in time.

To monitor the threads in a multi-thread executor, each thread also has a runningTask state marker. If lastActiveTimestamp is not refreshed in time and runningTask is false, we consider the thread healthy: it is simply idle and ready to run tasks.

The JDK calls ThreadPoolExecutor.beforeExecute(Thread t, Runnable r) right before a worker thread executes a task taken from the queue, and ThreadPoolExecutor.afterExecute(Runnable r, Throwable t) after the task finishes. So we can override these methods to update the runningTask state of the threads in the pool.
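A rough sketch of such an override is shown below. The class name `MonitoredThreadPoolExecutor` and the `runningTask` map are assumptions made for illustration; only the two hook methods are part of the JDK API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; the class name and the map are assumptions for illustration.
class MonitoredThreadPoolExecutor extends ThreadPoolExecutor {
    // thread id -> true while that worker thread is executing a task
    final Map<Long, Boolean> runningTask = new ConcurrentHashMap<>();

    MonitoredThreadPoolExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        // Invoked by worker thread t right before it runs r.
        runningTask.put(t.getId(), true);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        // Invoked by the worker thread after r completes, normally or exceptionally.
        runningTask.put(Thread.currentThread().getId(), false);
        super.afterExecute(r, t);
    }
}
```

Both hooks run on the worker thread itself, so the extra cost per task is two map updates.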

Current Pulsar Thread Category

Single-thread executors, with one queue bound to one thread: Netty EventLoop, OrderedExecutor, OrderedScheduler and Executors.newSingleThreadScheduledExecutor.
Multi-thread executors, with one queue bound to multiple threads: WebExecutorThreadPool (used by pulsar-web and pulsar-websocket) and those created by Executors.newXXXX.
ForkJoinPool.commonPool, used by CompletableFuture callbacks.

Changes

`ThreadMonitor` and `ThreadPoolMonitor` will be responsible for the main logic.

1. `ThreadMonitor`

It holds a Map<threadId, threadMonitorState{lastActiveTimestamp, runningTask}> that records the monitor state of each thread.

2. `ThreadPoolMonitor`

It periodically schedules ping tasks to all registered executor pools.
It also contains the logic to dynamically enable, disable and reconfigure the monitor. Reconfiguration runs as a scheduled task that reads the system properties to check whether the monitor config has changed.
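Blocked thread check and stack dump in the log: if a thread's lastActiveTimestamp has not been refreshed in time while it is marked as running a task (runningTask == true; a stale but idle thread with runningTask == false is considered healthy, as described in the basic idea), its stack is dumped using the ThreadMXBean provided by the JDK.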
Exited thread cleanup: a scheduled task removes exited threads recorded in ThreadMonitor.
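Dumping the stack of one suspicious thread with ThreadMXBean could look roughly like this. `StackDumpSketch` and `dump` are hypothetical names and the output format is an assumption; `getThreadInfo(long, int)` is the JDK call that limits the stack depth.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch; names and output format are assumptions for illustration.
class StackDumpSketch {
    // Returns a printable stack dump limited to maxDepth frames,
    // or null if the thread has exited or does not exist.
    static String dump(long threadId, int maxDepth) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo info = mx.getThreadInfo(threadId, maxDepth);
        if (info == null) {
            return null; // a cleanup task could use this to drop exited threads
        }
        StringBuilder sb = new StringBuilder(info.getThreadName())
                .append(" state=").append(info.getThreadState());
        for (StackTraceElement frame : info.getStackTrace()) {
            sb.append("\n\tat ").append(frame);
        }
        return sb.toString();
    }
}
```

Fetching a single ThreadInfo with a bounded depth keeps the cost lower than dumping all threads with full stacks.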

3. `ExecutorProvider` and `ScheduledExecutorProvider`

All executors and scheduled executors will now be created by these classes, e.g. Executors.newXXXXX -> ExecutorProvider.newXXXX and Executors.newScheduleXXXX -> ScheduledExecutorProvider.newXXXXX.
Executors created by a Provider will be registered in the ThreadPoolMonitor automatically.
EventLoopGroups created by EventLoopUtil will also be registered in the ThreadPoolMonitor automatically.
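The create-and-register pattern could be sketched as below. `ExecutorProviderSketch` and its `registered` list are stand-ins invented for this example; the real ExecutorProvider and ThreadPoolMonitor may be shaped differently.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Hypothetical sketch; the registry stands in for ThreadPoolMonitor.
class ExecutorProviderSketch {
    // Executors that the monitor will ping periodically.
    static final List<ExecutorService> registered = new CopyOnWriteArrayList<>();

    // Drop-in replacement for Executors.newSingleThreadScheduledExecutor()
    // that also registers the new executor for monitoring.
    static ScheduledExecutorService newSingleThreadScheduledExecutor() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        registered.add(executor);
        return executor;
    }
}
```

Callers keep the same factory-style call, so switching from Executors to the provider is a mechanical change.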

Metrics

pulsar_thread_last_active_timestamp_ms{threadName="${threadName}",tid="${tid}",runningTask="${runningTask}"} will be exposed on the Prometheus metrics endpoint.
thread_blocked_count, thread_blocked_time_ms, thread_waited_count and thread_waited_time_ms will be exposed on the Prometheus metrics endpoint if ThreadMXBean.isThreadContentionMonitoringSupported() returns true.

See https://docs.oracle.com/javase/8/docs/api/java/lang/management/ThreadInfo.html for more information. Note that these are counters and are not reset when a thread returns from the blocked or waiting state. They also keep increasing while a thread is idle with no task to run.
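Reading these four values for one thread might look like the sketch below. `ContentionMetricsSketch` and its method names are hypothetical; the ThreadInfo getters are JDK calls, and the two time values are -1 unless thread contention monitoring is enabled.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch; class and method names are assumptions for illustration.
class ContentionMetricsSketch {
    // Turns on blocked/waited time measurement when the JVM supports it.
    static boolean enableTimes() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadContentionMonitoringSupported()) {
            return false;
        }
        mx.setThreadContentionMonitoringEnabled(true);
        return true;
    }

    // Returns {blockedCount, blockedTimeMs, waitedCount, waitedTimeMs},
    // or null if the thread has exited.
    static long[] read(long threadId) {
        ThreadInfo info = ManagementFactory.getThreadMXBean().getThreadInfo(threadId);
        if (info == null) {
            return null;
        }
        return new long[] {
            info.getBlockedCount(), info.getBlockedTime(),
            info.getWaitedCount(), info.getWaitedTime()
        };
    }
}
```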

Configuration

In the current implementation, all configuration is set through system properties. Some tools can change system properties at runtime.

pulsar.thread-pool.monitor.enabled: whether to schedule ping tasks to registered executors.
pulsar.thread-pool.monitor.check.ms: interval at which ping tasks are scheduled to registered executors.
pulsar.thread-pool.monitor.blocking.dump.enabled: whether to dump blocked threads.
pulsar.thread-pool.monitor.blocking.dump.check.ms: interval at which to check for threads whose lastActiveTimestamp has not been refreshed in time.
pulsar-thread-pool.monitor.blocking.dump.threshold.ms: a thread whose lastActiveTimestamp is at least this old is considered blocked.
pulsar-thread-pool.monitor.blocking.dump.stack-max-depth: maximum stack depth of a thread dump.
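Because the settings are plain system properties, re-reading them on every check is enough to pick up runtime changes. A minimal sketch follows, in which `MonitorConfigSketch` is a hypothetical class and the default values (disabled, 10 seconds) are assumptions not stated in the proposal:

```java
// Hypothetical sketch; the defaults below are assumptions, not part of the proposal.
class MonitorConfigSketch {
    // True only if the property is set to "true" (case-insensitive); false when unset.
    static boolean enabled() {
        return Boolean.getBoolean("pulsar.thread-pool.monitor.enabled");
    }

    // Ping interval in milliseconds; falls back to an assumed 10 seconds
    // when the property is unset or not a valid number.
    static long checkIntervalMs() {
        return Long.getLong("pulsar.thread-pool.monitor.check.ms", 10_000L);
    }
}
```

A scheduled reconfiguration task can compare these values with the ones currently in use and reschedule the ping task when they differ.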

Left for discussion

The approach described above is not easy to apply to ForkJoinPool.

Most CompletableFuture callbacks run on ForkJoinPool.commonPool, where it is not easy to replace the pool or override the callback methods.
If we implemented our own CompletableFuture to trace each callback pending in the queue, it might introduce a lot of overhead and harm performance.

see https://stackoverflow.com/questions/61759182/monitoring-debugging-and-tracking-completablefuture-tasks for more information.

Alternatives

No response

Anything else?

No response

lhotari commented 1 year ago

@lifepuzzlefun Have you considered integrating existing tooling such as Reactor BlockHound to detect blocking code that is run on threads which should be non-blocking?

These BlockHound docs explain how to write an integration for BlockHound: https://github.com/reactor/BlockHound/blob/master/docs/custom_integrations.md and https://github.com/reactor/BlockHound/blob/master/docs/customization.md

Essentially, it's about configuring which threads should be non-blocking and about white-listing some known violations.

For example, Netty already comes with a BlockHound integration: https://github.com/netty/netty/blob/f027fa2df77af7719aa9686659633ee0fc73ebf9/common/src/main/java/io/netty/util/internal/Hidden.java#L31-L186

lhotari commented 1 year ago

I replied on the mailing list thread: https://lists.apache.org/thread/p76y2p4cst06nmo8yzcw8pyg19x6rtnt

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days, mark with Stale label.

apache / pulsar

PIP-232: Introduce thread monitor to check if thread is blocked for long time. #18985
