apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

[improve][client]pulsar-client-io thread has high CPU usage when client idle and batch message enable #23187

Open stillerrr opened 4 weeks ago

stillerrr commented 4 weeks ago

Motivation

Create a Pulsar client and, to initialize it, send a few messages to a topic. Afterwards the pulsar-client-io thread (just 1 thread by default) sits at 14% CPU usage even though the client is idle and nothing is being sent. [image]

The root cause is that batch message sending is enabled: ProducerImpl uses the pulsar-client-io thread to send batched messages every 1 ms by default. [image]

The Pulsar client uses an EventLoopGroup as its thread pool, with threads named "pulsar-client-io"; EpollEventLoopGroup is preferred. After switching to NioEventLoopGroup, CPU usage drops only slightly, to 9.6%. [image]

Solution

Use a ScheduledExecutorService instead of the EventLoopGroup to send batched messages. This reduces CPU usage to less than 1%. [image]
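A minimal sketch of the idea, using only `java.util.concurrent`. The class name `BatchFlusher`, its queue, and its counters are hypothetical and are not taken from Pulsar's ProducerImpl; the sketch only shows a periodic flush running on a dedicated scheduler thread rather than on the I/O event loop, and does not by itself demonstrate the CPU numbers reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration, not Pulsar code: a dedicated scheduler thread
// drains a queue of pending messages at a fixed delay.
class BatchFlusher implements AutoCloseable {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    private final AtomicInteger flushedMessages = new AtomicInteger();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "batch-flusher");
                t.setDaemon(true); // do not keep the JVM alive for flushing
                return t;
            });

    BatchFlusher(long intervalMillis) {
        // Assumed interval; 1 ms would mirror the default delay mentioned in the issue.
        scheduler.scheduleWithFixedDelay(
                this::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void send(String message) {
        pending.add(message);
    }

    // Collects everything queued so far into one batch and "sends" it
    // (here: only counts the messages).
    private void flush() {
        List<String> batch = new ArrayList<>();
        String m;
        while ((m = pending.poll()) != null) {
            batch.add(m);
        }
        if (!batch.isEmpty()) {
            flushedMessages.addAndGet(batch.size());
        }
    }

    int flushedMessages() {
        return flushedMessages.get();
    }

    @Override
    public void close() {
        scheduler.shutdown(); // cancels the periodic task
        try {
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        flush(); // drain whatever is still queued
    }
}
```

Calling `send(...)` a few times and then `close()` leaves `flushedMessages()` equal to the number of messages sent, since `close()` drains the queue after stopping the scheduler.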

lhotari commented 3 weeks ago

@stillerrr When the client is idle, batchFlushTasks aren't scheduled in the Pulsar client code. The current problem description, where it says "when client idle", doesn't make sense to me.

lhotari commented 3 weeks ago

It could be useful to use a real profiler such as https://github.com/async-profiler/async-profiler to pinpoint the problem.
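As a lighter first check before attaching a profiler, per-thread CPU time can be read from inside the JVM with the standard `ThreadMXBean`. This is a different and much coarser technique than async-profiler: it shows which thread is consuming CPU, not which code path. The class name `ThreadCpuSnapshot` is made up for this sketch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: snapshot of accumulated CPU time per thread name.
class ThreadCpuSnapshot {
    // Returns thread name -> CPU time in nanoseconds; empty if the JVM
    // does not support or has disabled thread CPU time measurement.
    static Map<String, Long> cpuNanosByThread() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has died
            long cpu = mx.getThreadCpuTime(id);     // -1 if unavailable
            if (info != null && cpu >= 0) {
                // Threads may share a name, so sum their CPU time.
                result.merge(info.getThreadName(), cpu, Long::sum);
            }
        }
        return result;
    }
}
```

Taking two snapshots a few seconds apart and comparing the values for a thread such as pulsar-client-io gives a rough CPU figure for that thread over the interval.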

stillerrr commented 3 weeks ago

> It could be useful to use a real profiler such as https://github.com/async-profiler/async-profiler to pinpoint the problem.

[image] The EventLoopGroup is always running, and NioEventLoopGroup uses less CPU (about 9%) than EpollEventLoopGroup.

stillerrr commented 3 weeks ago

> @stillerrr When the client is idle, batchFlushTasks aren't scheduled in the Pulsar client code. The current problem description, where it says "when client idle", doesn't make sense to me.

Running scheduled tasks on an EventLoopGroup uses more CPU than running them on a ScheduledExecutorService.

"When client idle" means few requests, maybe 1000 requests per second, and that is enough to cause batchFlushTasks to be scheduled.
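To make the arithmetic concrete, here is a small simulation under an assumed model: messages arrive evenly spaced, a flush timer is armed when a message arrives and no timer is pending, and the timer fires once after the flush delay. The class `FlushWakeups` and this model are illustrative assumptions; the real scheduling logic in ProducerImpl may differ.

```java
// Hypothetical model, not Pulsar code: counts flush timer firings caused by
// one second's worth of evenly spaced messages.
class FlushWakeups {
    static int wakeupsPerSecond(int messagesPerSecond, long flushDelayMicros) {
        if (messagesPerSecond <= 0) {
            return 0; // truly idle: no message ever arms a timer
        }
        long intervalMicros = 1_000_000L / messagesPerSecond;
        long timerFiresAt = -1; // -1 means no timer is armed
        int wakeups = 0;
        for (int i = 0; i < messagesPerSecond; i++) {
            long now = i * intervalMicros;
            // A pending timer whose deadline has passed fires before this message.
            if (timerFiresAt >= 0 && now >= timerFiresAt) {
                wakeups++;
                timerFiresAt = -1;
            }
            // The message arms a new timer only if none is pending.
            if (timerFiresAt < 0) {
                timerFiresAt = now + flushDelayMicros;
            }
        }
        if (timerFiresAt >= 0) {
            wakeups++; // the last armed timer still fires
        }
        return wakeups;
    }

    public static void main(String[] args) {
        System.out.println(wakeupsPerSecond(1000, 1000));
        System.out.println(wakeupsPerSecond(10, 1000));
        System.out.println(wakeupsPerSecond(0, 1000));
    }
}
```

In this model, 1000 messages per second with a 1 ms delay yields 1000 timer firings, 10 messages per second yields 10, and a client that sends nothing yields 0. So a "few requests" workload can keep the flush timer firing on nearly every millisecond, while a client that sends nothing schedules no flush at all.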

lhotari commented 1 week ago

> "When client idle" means few requests, maybe 1000 requests per second, and that is enough to cause batchFlushTasks to be scheduled.

Well, it's not really idling if there are 1000 requests per second. I think that the title of this issue is very misleading.