datafusion-contrib / arrow-zarr

Implementation of Zarr file format in Rust
Apache License 2.0

Use io_uring to read local files #24

Open maximedion2 opened 1 day ago

maximedion2 commented 1 day ago

https://docs.rs/io-uring/latest/io_uring/

io_uring uses queues that are shared with the kernel to minimize system calls when doing local file I/O. Let's try using it to reduce the time spent reading the Zarr chunks.

maximedion2 commented 1 day ago

PR linked. I ran some rough benchmarks, and I will add comments to the code explaining some of the parameters I'm using. Essentially, I'm seeing a significant speedup when reading files with io_uring compared to fs::read, something like 2x-4x. It varies a lot depending on whether I'm reading small chunks or larger chunks; I'm assuming that for the former the overhead of system calls takes up a bigger fraction of the total time. For more realistic, larger chunks, it's more like a 2x speedup, for now.
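For anyone who wants to reproduce the fs::read side of this comparison, here is a minimal sketch of a baseline harness. It is not the benchmark from the PR: the helper names (`write_fake_chunks`, `time_fs_read`, `run_baseline`), the file names, and the chunk sizes are all made up for illustration, and it uses only the standard library. Note that freshly written files will usually be served from the page cache, so this mostly measures per-call and copy overhead rather than disk speed.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Write `count` files of `size` bytes each into `dir` and return their paths.
/// These stand in for Zarr chunk files; the names and contents are hypothetical.
fn write_fake_chunks(dir: &Path, count: usize, size: usize) -> io::Result<Vec<PathBuf>> {
    fs::create_dir_all(dir)?;
    let payload = vec![0xABu8; size];
    let mut paths = Vec::with_capacity(count);
    for i in 0..count {
        let path = dir.join(format!("chunk_{}.bin", i));
        fs::write(&path, &payload)?;
        paths.push(path);
    }
    Ok(paths)
}

/// Read every file with `fs::read`, one call per file.
/// Returns the total number of bytes read and the elapsed wall-clock time.
fn time_fs_read(paths: &[PathBuf]) -> io::Result<(usize, Duration)> {
    let start = Instant::now();
    let mut total = 0;
    for path in paths {
        total += fs::read(path)?.len();
    }
    Ok((total, start.elapsed()))
}

/// Create the fake chunks in a temporary directory, time the reads,
/// then remove the directory again.
fn run_baseline(count: usize, size: usize) -> io::Result<(usize, Duration)> {
    let dir = std::env::temp_dir().join(format!(
        "fake_chunks_{}_{}_{}",
        std::process::id(),
        count,
        size
    ));
    let paths = write_fake_chunks(&dir, count, size)?;
    let result = time_fs_read(&paths);
    fs::remove_dir_all(&dir)?;
    result
}
```

Calling `run_baseline` once with many small chunks and once with a few large ones, then dividing the elapsed time by the byte count, gives a rough per-byte cost for each case. If the assumption above about system call overhead holds, the small-chunk case should come out noticeably more expensive per byte.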

I wrote something very simple and haven't yet tried anything complicated to really get the most out of io_uring; I can come back to that later. I think there might be lower-hanging fruit in terms of improving performance for now.