Kotlin / kotlinx-io

Kotlin multiplatform I/O library
Apache License 2.0

Better segment pools #311

Open fzhinkin opened 2 months ago

fzhinkin commented 2 months ago

Currently, segment pools exist only on JVM (on other platforms, implementations are effectively no-op) and behave more like caches than pools.

There are a few directions in which we may/should develop segment pools:

This is an epic that describes what could be done and tracks progress, rather than a specification of what should be implemented.

fzhinkin commented 1 month ago

The first two points (not returning segments back to the pool and small pool size) are blockers for integration with Ktor (https://youtrack.jetbrains.com/issue/KTOR-6030, https://github.com/ktorio/ktor/pull/4032), so they need to be fixed.

e5l commented 1 month ago

It would be great to have a system property to adjust the pool size as well.
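A minimal sketch of what such a knob could look like on the JVM. The property name and the default value below are hypothetical and chosen only for illustration; they are not taken from kotlinx-io.

```kotlin
// Hypothetical property name, used only for illustration.
const val POOL_SIZE_PROPERTY = "example.segment.pool.size.bytes"

// Assumed default; not necessarily what the library uses.
const val DEFAULT_POOL_SIZE_BYTES = 64 * 1024

// Reads the pool size from a JVM system property, falling back to the
// default when the property is absent, malformed, or negative.
fun configuredPoolSizeBytes(): Int {
    val raw = System.getProperty(POOL_SIZE_PROPERTY) ?: return DEFAULT_POOL_SIZE_BYTES
    val parsed = raw.trim().toIntOrNull() ?: return DEFAULT_POOL_SIZE_BYTES
    return if (parsed >= 0) parsed else DEFAULT_POOL_SIZE_BYTES
}
```

With this shape, the size could be set at launch with `-Dexample.segment.pool.size.bytes=1048576`, and an invalid value would silently fall back to the default rather than fail at startup.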

fzhinkin commented 1 month ago

As @bjhham pointed out, Buffer::close should release segments (currently, it's a no-op).

fzhinkin commented 1 month ago

However, it could be problematic, as a typical Buffer use scenario is "allocate, use, and forget".

e5l commented 1 month ago

Maybe we can fix this by manually allocating buffers for writing through a different construction method, so that the buffers for channels and primitives in Ktor are tracked.

fzhinkin commented 1 month ago

make sure every segment is returned to a pool once there are no users remaining

One of the scenarios in which a segment is not returned to the pool is when it was shared between multiple buffers. Currently, sharing is tracked by a flag, so there's no way to check whether a segment can be safely returned to the pool.

Replacing the flag with a ref-counter solves the issue, and the performance impact seems negligible.
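The difference between a flag and a counter can be sketched as follows. `CountedSegment` and `SimplePool` are hypothetical stand-ins written for this illustration, not the kotlinx-io classes, and the 8192-byte size is an assumption. A boolean "shared" flag can only say that sharing happened at some point; a counter can say when the last user is gone.

```kotlin
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for a pooled segment; not the kotlinx-io Segment class.
class CountedSegment(val data: ByteArray = ByteArray(8192)) {
    // Number of buffers currently referencing this segment's data.
    private val refs = AtomicInteger(1)

    // Called when another buffer starts sharing the data.
    fun retain() {
        refs.incrementAndGet()
    }

    // Called when a buffer stops using the segment.
    // Returns true only for the last user, who may then recycle it.
    fun release(): Boolean {
        val remaining = refs.decrementAndGet()
        check(remaining >= 0) { "segment released more times than retained" }
        return remaining == 0
    }
}

// Hypothetical single-threaded pool, kept deliberately trivial.
// A real pool would also reset the counter before handing a segment out again.
class SimplePool {
    private val free = ArrayDeque<CountedSegment>()
    val size: Int get() = free.size

    // Pools the segment only when the caller was its last user.
    fun recycle(segment: CountedSegment) {
        if (segment.release()) free.addLast(segment)
    }
}
```

In this sketch, a segment shared by two buffers reaches the pool on the second `recycle` call, whereas with a flag it would have to be dropped because nobody could tell whether the other buffer was still using it.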

fzhinkin commented 1 month ago

The issue with changing the default pool size is how this property is used: currently, the pool consists of multiple buckets (the number depends on the CPU count), and the property applies to each bucket individually. In an unlucky scenario, we may end up with every bucket filled with pooled segments while only one of them is actually used.

This part is tricky in terms of performance; I need to research a bit more what could be done.
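To make the concern concrete, here is a rough calculation of the worst case. The bucket-count policy (CPU count rounded up to a power of two) and the numbers in the usage note are assumptions made for illustration, not values read from the library.

```kotlin
// Assumed policy: round the CPU count up to the nearest power of two.
fun bucketCountFor(cpuCount: Int): Int {
    require(cpuCount > 0) { "cpuCount must be positive" }
    return Integer.highestOneBit(cpuCount * 2 - 1)
}

// Upper bound on memory retained by the pool when the size limit
// applies to every bucket separately.
fun worstCaseRetainedBytes(bucketCount: Int, perBucketLimitBytes: Long): Long =
    bucketCount.toLong() * perBucketLimitBytes
```

Under these assumptions, a 6-core machine gets 8 buckets, so raising the per-bucket limit to 4 MiB lets the pool retain up to 32 MiB in total, even if the only active thread can reach just one 4 MiB bucket.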

e5l commented 1 month ago

You could try the same strategy as for a connection pool: allocate lazily and release allocated buffers over time.
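One way to read this suggestion, as a rough sketch: nothing is allocated up front, and pooled arrays that stay idle for too long are dropped. `ExpiringPool`, its parameters, and the 8192-byte array size are hypothetical; the class is not thread-safe and is meant only to show the idea.

```kotlin
// Hypothetical pool: allocates lazily and drops entries idle longer than maxIdleNanos.
class ExpiringPool(
    private val maxIdleNanos: Long,
    private val clock: () -> Long = System::nanoTime,
) {
    private class Entry(val bytes: ByteArray, val returnedAt: Long)

    // Entries are appended in the order they are returned, so the oldest is at the front.
    private val idle = ArrayDeque<Entry>()
    val idleCount: Int get() = idle.size

    // Reuses the most recently returned array, or allocates one on demand.
    fun take(): ByteArray = idle.removeLastOrNull()?.bytes ?: ByteArray(8192)

    // Returns an array to the pool and records when it became idle.
    fun recycle(bytes: ByteArray) {
        idle.addLast(Entry(bytes, clock()))
    }

    // Drops arrays that have been idle for at least maxIdleNanos.
    fun trim() {
        val now = clock()
        while (true) {
            val oldest = idle.firstOrNull() ?: return
            if (now - oldest.returnedAt < maxIdleNanos) return
            idle.removeFirst()
        }
    }
}
```

Here `trim()` would have to be called periodically, for example from a timer or on every `recycle`; which trigger is cheap enough is exactly the kind of performance question raised above.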

fzhinkin commented 3 weeks ago

Currently, I'm gravitating towards a solution with a two-tier segment pool:

fzhinkin commented 3 weeks ago

Some of the issues from the summary are addressed here: #352