chriskohlhoff / asio

Asio C++ Library
http://think-async.com/Asio

IOCP on Windows to support SetFileIoOverlappedRange for OVERLAPPED structs... #79

Open leonleon77 opened 9 years ago

leonleon77 commented 9 years ago

Was wondering about the pros and cons of adding support for SetFileIoOverlappedRange on Windows... https://msdn.microsoft.com/en-us/library/windows/desktop/aa365540%28v=vs.85%29.aspx

Ext3h commented 1 year ago

Cons:

Pros:

Traps:

When evaluating whether it's actually giving you any performance benefits, it's necessary to first ensure that you actually have some base load for the critical paths. A concurrent thread with lots of large dynamic memory allocations on the heap is a prime candidate, because it's hitting the same process / system global page table mutex.
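For anyone wanting to reproduce that kind of base load, here is a minimal sketch of a background allocation thread. It is not part of Asio; the class name `AllocationLoad`, the 4096-byte page stride and the block size are assumptions picked for illustration. Whether it actually contends with your I/O path depends on the allocator and OS, so treat it as a starting point only.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <memory>
#include <thread>

// Hypothetical helper (not an Asio or Windows API): a background thread that
// keeps allocating, touching and freeing large heap blocks so that the
// memory manager is busy while the I/O path is being measured.
class AllocationLoad {
public:
    explicit AllocationLoad(std::size_t block_bytes)
        : block_bytes_(block_bytes), worker_([this] { run(); }) {}

    ~AllocationLoad() {
        stop_.store(true);
        worker_.join();
    }

    // Number of allocate/touch/free cycles completed so far.
    std::size_t iterations() const { return iterations_.load(); }

private:
    void run() {
        while (!stop_.load()) {
            std::unique_ptr<unsigned char[]> block(new unsigned char[block_bytes_]);
            // Write one byte per assumed 4 KiB page so the pages are really
            // faulted in; volatile keeps the compiler from eliding the work.
            volatile unsigned char* p = block.get();
            for (std::size_t off = 0; off < block_bytes_; off += 4096) {
                p[off] = 1;
            }
            iterations_.fetch_add(1);
        }
    }

    std::size_t block_bytes_;
    std::atomic<bool> stop_{false};
    std::atomic<std::size_t> iterations_{0};
    std::thread worker_;  // declared last so the other members exist before it starts
};
```

The idea would be to construct one or more `AllocationLoad` objects (for example with a block size of 64 MiB) before starting the timed I/O run, then compare throughput with and without SetFileIoOverlappedRange under the same load.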

On a perfectly idle system where only the IOCP APIs are used by a single thread, you will easily max out any other possible bottleneck, before SetFileIoOverlappedRange becomes relevant.

On a contended system, only a fraction of that throughput is realistic without SetFileIoOverlappedRange. If a profiler points you to MmProbeAndLockPages and, within it, to KeYieldProcessorEx, you are hitting a nasty spinlock. The I/O APIs are not at fault for holding it too long, they just get hung up on it. And as with any spinlock, the more threads you throw at it, the worse it becomes.


So, to put it short: If you don't need SetFileIoOverlappedRange then you won't notice. When you need it, you will hit a wall without it.

It's only solving the "high bandwidth" part for direct I/O though.

You are still going to hit an IOPS limit eventually, and for that metric IoRing (see #1200) would be the way to go. It coincidentally covers both registered buffers and high IOPS, but in return it is not IOCP / Overlapped related at all.

Emjayen commented 2 months ago

Just in case anyone stumbles upon this: much of the above is incorrect.

[^1]: This isn't strictly true, as it depends on the I/O buffering mode and negotiation with the uppermost driver.

[^2]: RIO is rather borked in respect to performance, and has been for years (and seemingly not maintained anymore).