Implement chunk iterators that drop the GIL

Congyuwang / RocksDict

Python fast on-disk dictionary / RocksDB & SpeeDB Python binding

https://congyuwang.github.io/RocksDict/rocksdict.html

MIT License

173 stars 8 forks

Implement chunk iterators that drop the GIL #106

Closed: GodTamIt closed this 7 months ago

GodTamIt commented 8 months ago

This allows callers to pull a chunk of items at a time from the iterator. In that scenario it makes sense to drop the GIL while the chunk is being fetched.

Dropping the GIL is useful not only when there are multiple iterators but also when the calling code may be running other GIL-dropping code.
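For illustration, here is a pure-Python sketch of what consuming an iterator in chunks looks like from the caller's side. `iter_chunks` is a hypothetical helper written for this example, not the RocksDict API. The point is only that one call returns many items at once, which is what makes it reasonable to release the GIL while a chunk is being filled in native code.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield lists of at most chunk_size items (hypothetical helper, not RocksDict's API)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    it = iter(items)
    while True:
        # Pull up to chunk_size items in one step; an empty list means exhaustion.
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Ten items in chunks of four: the last chunk is shorter.
chunks = list(iter_chunks(range(10), 4))
print(chunks)
```

The caller then loops over lists instead of single items, so any per-call overhead is paid once per chunk rather than once per item.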

Benchmarking Rdict Iterator...
Iterator performance: 4194304 items in 1.7615967919118702 seconds (2380967.0971572897 it/s)
Iterator performance multi-thread: 16777216 items in 6.953388374997303 seconds (2412811.5812323666 it/s)

Benchmarking Rdict Batch Iterator...
Batched iterator performance: 16777216 items in 12.566605832893401 seconds (1335063.4390143137 it/s)
Batched iterator performance multi-thread: 16777216 items in 9.088478291872889 seconds (1845987.3546710834 it/s)
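As a rough sketch of how single-thread and multi-thread figures like these can be measured, the snippet below times a stand-in workload with `time.perf_counter` and `concurrent.futures.ThreadPoolExecutor`. It is not the benchmark script from this PR: `consume` and `measure` are hypothetical names, and the workload is a plain Python loop rather than a database iterator.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def consume(n_items: int) -> int:
    """Stand-in workload (assumption): iterate over n_items values and count them."""
    count = 0
    for _ in range(n_items):
        count += 1
    return count


def measure(n_threads: int, items_per_thread: int) -> Tuple[int, float]:
    """Run consume() on n_threads threads; return (total items, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        counts = list(pool.map(consume, [items_per_thread] * n_threads))
    elapsed = time.perf_counter() - start
    return sum(counts), elapsed


total, elapsed = measure(n_threads=4, items_per_thread=100_000)
print(f"{total} items in {elapsed:.3f} seconds ({total / elapsed:.0f} it/s)")
```

Because this stand-in is pure Python, its threads serialize on the GIL in a standard CPython build. Only work that releases the GIL can let the multi-thread it/s figure rise above the single-thread one.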

Congyuwang commented 8 months ago

I think it's better to provide a default limited chunk size rather than using None as default--doesn't seem like a default behaviour.

Congyuwang commented 8 months ago

Iter_chunk still seems significantly slower, although the chunk size is pretty large (25000).

GodTamIt commented 8 months ago

> Iter_chunk still seems significantly slower, although the chunk size is pretty large (25000).

That's because it's doing 4x more work in less than 4x the time.

GodTamIt commented 8 months ago

> I think it's better to provide a default limited chunk size rather than using None as default--doesn't seem like a default behaviour.

This is now done.

> Iter_chunk still seems significantly slower, although the chunk size is pretty large (25000).

@Congyuwang iter_chunk is faster. It's more helpful to compare the it/s metric than the total seconds, because the number of items used to differ between the single-thread and multi-thread runs. I've updated the benchmark so that both runs use the same number of items.

Congyuwang commented 8 months ago

I'm comparing iter vs. iter_chunk, not multithreaded vs. single-threaded. Based on the previous benchmark, multithreaded iter_chunk seems slower than multithreaded iter, where the GIL is not released.
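To make the comparison concrete, the arithmetic can be redone from the numbers in the benchmark output posted at the top of the thread. The item counts and timings below are copied from that output; the dictionary keys are shortened labels chosen for this sketch.

```python
# (items, seconds) pairs copied from the benchmark output earlier in the thread.
runs = {
    "iter single-thread": (4194304, 1.7615967919118702),
    "iter multi-thread": (16777216, 6.953388374997303),
    "batched single-thread": (16777216, 12.566605832893401),
    "batched multi-thread": (16777216, 9.088478291872889),
}

# Throughput in items per second normalizes away the differing item counts.
rates = {name: items / secs for name, (items, secs) in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:,.0f} it/s")

# Multi-threaded batched iterator relative to multi-threaded plain iterator.
ratio = rates["batched multi-thread"] / rates["iter multi-thread"]
print(f"batched/iter (multi-thread): {ratio:.2f}")
```

On those posted figures the multi-threaded batched iterator runs at roughly three quarters of the plain iterator's it/s, which is the gap being pointed out here.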

Congyuwang commented 8 months ago

Looks like we have some kind of deadlock.