zio / zio

ZIO — A type-safe, composable library for async and concurrent programming in Scala
https://zio.dev
Apache License 2.0

ZIO.timeoutTo has tremendous overhead #9211

Open eyalfa opened 2 hours ago

eyalfa commented 2 hours ago

I've added a couple of benchmarks comparing a simple effect evaluation, the same effect with a large timeout (so it never fails on timeout), and a (surprising 😎) alternative timeout implementation.

Benchmark results:

Benchmark                     (n)   Mode  Cnt         Score        Error  Units
AwaitBenchmark.zioBaseline  10000  thrpt   15  15102658.764 ± 655913.156  ops/s
AwaitBenchmark.zioTimeout   10000  thrpt   15     80550.279 ±   7204.177  ops/s
AwaitBenchmark.zioTimeout1  10000  thrpt   15   1200702.972 ±  42561.208  ops/s

The original implementation is based on raceFibersWith, which basically forks the self effect and a sleep effect. The idea behind the alternative implementation is to eliminate the second fiber, replace it with clock.scheduler.schedule, and control the entire thing using ZIO.asyncInterrupt.
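To make the single-timer idea concrete, here is a minimal sketch outside ZIO, using only scala.concurrent and a java.util.concurrent scheduler. It is not the code from the branch and does not use ZIO's API: `TimeoutSketch` and its `timeoutTo` helper are hypothetical names made up for illustration. Instead of racing the computation against a second "sleeping" task, it registers one timer callback, and whichever side finishes first completes a shared Promise.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object TimeoutSketch {
  // Hypothetical helper (not a ZIO API): complete with `fallback` if `fa`
  // has not finished within `after`, using a single scheduled timer task
  // rather than a second racing computation.
  def timeoutTo[A](fa: Future[A], after: FiniteDuration, fallback: => A)(
      scheduler: ScheduledExecutorService
  )(implicit ec: ExecutionContext): Future[A] = {
    val result = Promise[A]()

    // The timer only tries to complete the promise; it loses silently
    // if the main computation got there first.
    val timer = scheduler.schedule(
      new Runnable {
        def run(): Unit = { result.trySuccess(fallback); () }
      },
      after.length,
      after.unit
    )

    fa.onComplete { outcome =>
      // The main computation finished: cancel the pending timer and
      // publish the outcome unless the timer already fired.
      timer.cancel(false)
      result.tryComplete(outcome)
    }

    result.future
  }
}
```

Unlike a ZIO fiber, a Future cannot be interrupted, so this sketch only stops waiting on timeout; it does not cancel the underlying work, which is the part ZIO.asyncInterrupt would have to cover in a real implementation.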

As the benchmark results suggest, I already have a branch with a crude implementation. I'd be happy to hear some feedback on the idea and, even better, to receive some additional benchmarking suggestions.

eyalfa commented 2 hours ago

CC: @kyri-petrou

...I also have a thought about eliminating the fibers altogether, but this requires some more work at this point.

ghostdogpr commented 2 hours ago

Yeah it's kind of a known issue: https://github.com/zio/zio/issues/7628

ZIO 1 handled race as a special case, but that was lost in ZIO 2.

eyalfa commented 2 hours ago

@ghostdogpr would you consider accepting this as a PR? (I'll catch up on the referenced issue later on.)

ghostdogpr commented 2 hours ago

Wouldn't it be better to make race faster rather than just timeout?

eyalfa commented 2 hours ago

First of all, yes, let's make it faster! However, the timeoutXXX operators are special, as they don't really need the second fiber (I suspect I can get rid of the first one as well 😎), so I think they deserve their own optimization.

ghostdogpr commented 2 hours ago

Sure, why not 😄

eyalfa commented 2 hours ago

BTW, I started investigating this from the streams angle. stream.timeoutXXX is implemented in terms of pull.timeoutXXX, which basically forks per pull, effectively destroying the stream's performance.

Notice that Akka's implementation doesn't suffer from this (it suffers from other things, as it's not batching...), since adding a timeout just means the underlying actor has to react to another message.

eyalfa commented 2 hours ago

BTW, I suspect moving race into FiberRuntime won't solve the issue, since the dominant cost here is forking these two fibers, not reacting to their completion.

I also suspect there are other scenarios where at least one of the fibers can be eliminated. Consider scenarios where one fiber is already running and we're now starting an operation that's supposed to race with it. This happens in ZChannel.mergeWith, where ZIO actually forks and races two Fiber.join effects (I did partly optimize this one with a poll). I believe there's no real need for the two 'joiners' in this case.