Open Hyjaz opened 1 year ago
Do you have a CPU profile that could show where CPU time is being spent? If your application isn't already instrumented with an APM solution, you can use something like 0x to generate a flame graph, ideally with a comparison against 2.2.3.
Hello @Nevon, on the latest kafkajs version it seems to be coming from the scheduleCheckPendingRequest. I noticed that there was a change related to this in the requestQueue/index.js in the latest release. Let me know if you need more details.
This is v2.2.3. You can see there is a huge jump in the cpu usage between v2.2.4 and this one.
Thank you, that's what I suspected, but it's great to have some data to back it up. For reference, the change was introduced in #1532.
I also noticed this CPU increase in my app. When idle, v2.2.3 uses almost no CPU, while v2.2.4 uses around 8%.
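For anyone who wants to put a number on idle CPU like the figures above without a full profiler, here is a minimal sketch using Node's built-in process.cpuUsage(). The helper names cpuPercent and sampleCpu are made up for illustration and are not part of kafkajs; the idea is to run the same app once per kafkajs version and compare the samples.

```javascript
// Hypothetical helpers for comparing idle CPU between kafkajs versions.
// Nothing here is part of the kafkajs API.

// Convert a CPU-time delta (microseconds, as reported by process.cpuUsage())
// into a percentage of one core over the elapsed wall-clock time.
function cpuPercent(usageDiff, elapsedMs) {
  const cpuMs = (usageDiff.user + usageDiff.system) / 1000;
  return (cpuMs * 100) / elapsedMs;
}

// Call onSample(percent) every intervalMs with the CPU used by this process
// during that window. Returns a function that stops the sampling.
function sampleCpu(intervalMs, onSample) {
  let lastUsage = process.cpuUsage();
  let lastTime = process.hrtime.bigint();

  const timer = setInterval(() => {
    const usage = process.cpuUsage();
    const now = process.hrtime.bigint();
    const diff = {
      user: usage.user - lastUsage.user,
      system: usage.system - lastUsage.system,
    };
    const elapsedMs = Number(now - lastTime) / 1e6;
    onSample(cpuPercent(diff, elapsedMs));
    lastUsage = usage;
    lastTime = now;
  }, intervalMs);

  // Do not keep the process alive just because of the sampler.
  timer.unref();
  return () => clearInterval(timer);
}
```

Usage would be something like `sampleCpu(5000, (pct) => console.log(pct.toFixed(1) + "% CPU"))` after the consumer or producer has connected. The sampler itself wakes up on a timer, so keep the interval large enough that it does not distort an idle measurement.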
I kinda want to have the best of 2.2.3 and 2.2.4 but the CPU spike is way too much to upgrade currently, which is why we pinned it to 2.2.3
@Nevon I took a stab at it in the linked PR: https://github.com/tulios/kafkajs/pull/1572
We also have this issue after upgrading from kafkajs 1 -> latest. All services that have upgraded consume way more CPU, and event loop iterations per second have increased 100x. After applying @MDSLKTR's fix as a patch, the issue goes away. Would like to see this get merged ASAP.
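If you want to check event loop pressure in your own service before and after the patch, a rough sketch with Node's perf_hooks is below. Note the assumption: it reports event loop utilization (the fraction of time the loop was busy), which is a related signal but not the same metric as iterations per second. The helper name watchEventLoop is made up for illustration and is not part of kafkajs.

```javascript
const { performance } = require("perf_hooks");

// Hypothetical helper, not part of kafkajs.
// Calls onSample(utilization) every intervalMs, where utilization is the
// fraction (0..1) of the window the event loop spent doing work rather
// than idling. Returns a function that stops the sampling.
function watchEventLoop(intervalMs, onSample) {
  let last = performance.eventLoopUtilization();

  const timer = setInterval(() => {
    const current = performance.eventLoopUtilization();
    // Passing the later reading first and the earlier one second yields
    // the delta for the window between the two readings.
    const delta = performance.eventLoopUtilization(current, last);
    onSample(delta.utilization);
    last = current;
  }, intervalMs);

  // Do not keep the process alive just because of the sampler.
  timer.unref();
  return () => clearInterval(timer);
}
```

Running this against the same workload on 2.2.3, 2.2.4, and 2.2.4 with the patch applied should make the difference visible, assuming the extra CPU really does come from additional work on the event loop.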
Thank you @MDSLKTR !
We also noticed this CPU increase in our app. It is using roughly 1.5x to 2x more CPU than before.
We had been using kafka-node for a very long time and decided to move to kafkajs, starting with publishing. We have a very high-throughput logging system (~0.5 million RPM) for a single microservice. Once we switched to kafkajs 2.2.4, the CPU spike described here was too much to handle on the resource side, so I can confirm that @MDSLKTR's findings affect the system as expected. We switched back to 2.2.3 and resource utilisation returned to normal. We should think about patching this in the next version. Attaching some system metrics from our production environment. (The first spike is when we switched from kafka-node to kafkajs 2.2.4; the drop after it is when we deployed kafkajs 2.2.3.)
Describe the bug When upgrading to v2.2.4 we saw a 190% increase in our CPU usage.
To Reproduce Not sure how you can reproduce it. We do, however, send thousands of messages per second.
Expected behavior No CPU usage hike between v2.2.3 and v2.2.4.