fd4s / fs2-kafka

Functional Kafka Streams for Scala
https://fd4s.github.io/fs2-kafka
Apache License 2.0

Performance regression Producer 3.5.0 vs 3.4.0 #1321

Closed atnoya closed 7 months ago

atnoya commented 7 months ago

We upgraded fs2-kafka from 3.4.0 to 3.5.0, and both performance and CPU usage seem to have taken a serious hit.

[Screenshots: 2024-04-24 at 14:23:04, 14:23:34, and 14:47:37]

In the screenshots above, you can see the point at which we switched back to 3.4.0, at 14:18.

Looking at the list of changes between 3.4.0 and 3.5.0, we tried downgrading only fs2-core to 3.9.4, as we thought it could be the root cause, but we still got terrible performance.
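For anyone wanting to repeat that experiment, one way to pin only the fs2 core module while staying on fs2-kafka 3.5.0 is an sbt override. This is a sketch of a possible `build.sbt` fragment, not necessarily the exact setup used here:

```scala
// build.sbt (sketch): keep fs2-kafka at 3.5.0 but force the older fs2-core.
libraryDependencies += "com.github.fd4s" %% "fs2-kafka" % "3.5.0"

// dependencyOverrides forces this version even though fs2-kafka 3.5.0
// transitively requests a newer one.
dependencyOverrides += "co.fs2" %% "fs2-core" % "3.9.4"
```

Running `sbt evicted` afterwards should help confirm which version ended up on the classpath.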

I don't know the internals of cats-effect or fs2 that well, but it looks to me like the problem might be the change from Sync[F].blocking to Sync[F].interruptible. From my limited experience, I don't see how any of the other changes could cause this.
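To make the suspected change concrete, here is a minimal sketch of the two variants, assuming cats-effect 3.x. `blockingCall` and the object name are hypothetical stand-ins, not fs2-kafka code:

```scala
import cats.effect.{IO, Sync}

object BlockingVsInterruptible {
  // Hypothetical stand-in for a blocking call into the Java Kafka client.
  def blockingCall(): Int = {
    Thread.sleep(10)
    42
  }

  // Sync[F].blocking: runs the thunk on a blocking thread. Once started it
  // cannot be canceled mid-call, so cancelation waits until it returns.
  def viaBlocking[F[_]](implicit F: Sync[F]): F[Int] =
    F.blocking(blockingCall())

  // Sync[F].interruptible: like blocking, but on cancelation the runtime
  // attempts to abort the call with a thread interrupt.
  def viaInterruptible[F[_]](implicit F: Sync[F]): F[Int] =
    F.interruptible(blockingCall())

  // Runs both variants in sequence and returns their results.
  val demo: IO[(Int, Int)] =
    for {
      a <- viaBlocking[IO]
      b <- viaInterruptible[IO]
    } yield (a, b)
}
```

Both variants return the same value. They differ only in how cancelation is handled, which is where any extra per-call cost would have to come from.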

I am happy to provide more information if I can. In the meantime, I will try to test my hypothesis above and will report back if I find anything.

abestel commented 7 months ago

I reached the exact same conclusion this afternoon: this PR https://github.com/fd4s/fs2-kafka/pull/1126 seems to be the culprit (tested by reverting it and publishing a local version).

aartigao commented 7 months ago

Wow... I'm astonished...

aartigao commented 7 months ago

I'm also not well versed in the CE internals, and judging by the docs for interruptible, it didn't seem like it would hurt. That's why I merged that PR 😢

Of course, I'm going to create a fix for this now, but I'm really curious about why there is such a performance drop 🤔 cc @armanbilge was that expected?

aartigao commented 7 months ago

Maybe interruptible doesn't shift to the blocking pool?

atnoya commented 7 months ago

https://github.com/typelevel/cats-effect/blob/769a89ef5d39f35d3a2cd00ffedbd22c91df48cc/core/jvm/src/main/scala/cats/effect/IOFiberPlatform.scala#L28

I can see some complex logic there, including semaphores, AtomicRefs, and busy waits that do not run in the blocking thread pool. Could that explain the increased CPU usage?

https://github.com/typelevel/cats-effect/blob/769a89ef5d39f35d3a2cd00ffedbd22c91df48cc/core/jvm/src/main/scala/cats/effect/IOFiberPlatform.scala#L177

But it does seem that the action itself is run in the blocking thread pool.
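A quick way to check this empirically is to capture the thread name from inside each variant. This is a small sketch assuming cats-effect 3.x; the object and value names are made up for illustration, and the actual thread names depend on the runtime version:

```scala
import cats.effect.{IO, IOApp}

object WhichThread extends IOApp.Simple {
  // Captures the name of the thread that evaluates each thunk.
  val names: IO[(String, String)] =
    for {
      b <- IO.blocking(Thread.currentThread().getName)
      i <- IO.interruptible(Thread.currentThread().getName)
    } yield (b, i)

  // Prints both names so they can be compared by eye.
  def run: IO[Unit] =
    names.flatMap { case (b, i) =>
      IO.println(s"blocking:      $b") *>
        IO.println(s"interruptible: $i")
    }
}
```

This only shows where the thunk itself runs. It says nothing about the coordination around it, which is the part suspected of costing CPU.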

atnoya commented 7 months ago

We could open an issue in the cats-effect repo to ask for clarification.

atnoya commented 7 months ago

Btw thanks a million for the quick reaction and the release 🙇

atnoya commented 7 months ago

Just confirming that the new release, 3.5.1, fixes the issue:

[Screenshots: 2024-04-25 at 12:34:48, 12:34:55, and 12:35:03]

I am happy to close the issue, unless you want to keep it open to track the interruptible problem.

aartigao commented 7 months ago

It's fine to close it. Thank you!