apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++] Create I/O thread pools per filesystem #30552

Open asfimport opened 2 years ago

asfimport commented 2 years ago

The IOContext gets us most of the way there, but we don't do this yet. One concrete advantage is that it would let us set the number of I/O threads more intelligently.

For example, 8 threads is often too few for an S3 filesystem (ARROW-14965). On the other hand, in some cases 8 threads can be too many for an HDD (ARROW-14354).

I doubt we will be able to figure out the ideal I/O thread pool size for every filesystem (e.g. on an S3 filesystem it depends on how many cores you have and how much bandwidth the system has), but we can probably pick more sensible defaults.

Furthermore, it will hopefully make the connection between a filesystem and its I/O thread pool size clearer to the user.
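
To make the idea of per-filesystem defaults concrete, here is a rough sketch of what choosing a capacity per filesystem could look like. It uses only the standard library. The names `DefaultIoThreadsForScheme`, `IoThreadConfig`, and `kSharedDefaultIoThreads` are hypothetical and do not exist in Arrow, and the specific numbers are placeholders rather than measured recommendations. It only models the capacity choice, not the thread pools themselves.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: none of these names are part of Arrow's API.
// All thread counts below are illustrative placeholders.

// The single shared capacity discussed in this issue ("8 threads").
constexpr int kSharedDefaultIoThreads = 8;

// Returns a default I/O thread count for a filesystem, keyed by URI scheme.
inline int DefaultIoThreadsForScheme(const std::string& scheme, int cpu_cores) {
  assert(cpu_cores >= 1);
  if (scheme == "s3") {
    // High-latency remote store: allow more concurrent requests than the
    // shared default, scaled by core count (assumed heuristic).
    return std::max(kSharedDefaultIoThreads, 4 * cpu_cores);
  }
  if (scheme == "file") {
    // Might be a spinning disk: keep concurrency low to limit seeking.
    return 2;
  }
  // Unknown filesystem: fall back to the shared default.
  return kSharedDefaultIoThreads;
}

// Per-filesystem capacity table with user overrides on top of the defaults.
class IoThreadConfig {
 public:
  explicit IoThreadConfig(int cpu_cores) : cpu_cores_(cpu_cores) {}

  // Overrides the capacity for one scheme; values below 1 are clamped to 1.
  void SetCapacity(const std::string& scheme, int threads) {
    overrides_[scheme] = std::max(1, threads);
  }

  // Returns the override if one was set, otherwise the per-scheme default.
  int GetCapacity(const std::string& scheme) const {
    auto it = overrides_.find(scheme);
    if (it != overrides_.end()) return it->second;
    return DefaultIoThreadsForScheme(scheme, cpu_cores_);
  }

 private:
  int cpu_cores_;
  std::map<std::string, int> overrides_;
};
```

With something like this, a user on a slow disk could call `SetCapacity("file", 1)` without affecting S3 reads, which is the filesystem-to-pool-size connection described above. How such a table would plug into IOContext is left open here.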

Reporter: Weston Pace / @westonpace

Note: This issue was originally created as ARROW-15035. Please see the migration documentation for further details.

asfimport commented 2 years ago

Todd Farmer / @toddfarmer: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.