Open Alogani opened 4 months ago
Always so simple with Windows...
Hi @Araq,
I need your review. I'm working on providing a common API for Windows async (IOCP) and Unix async (std/selectors), to allow the future transformation of NimGo towards our M:N goal (without having to refactor every high-level API later).
I came up with this kind of signature for the private selector API:
proc select*(selector: Selector, timeoutMs: int): tuple[fd: AsyncFd, customData: pointer]

proc registerReadFile*(selector: Selector, fd: AsyncFd, customData: pointer,
                       buffer: pointer, size: int, bytesRead: ptr int,
                       cancellation: out proc())

## The select function doesn't use Windows' select but `getQueuedCompletionStatus`,
## which returns only one completed event at a time.
which could be used like this:
proc readFile(fd: AsyncFd, buffer: var string, size: int, timeoutMs = -1) =
  ## This is only an example usage:
  buffer = newString(size)
  # Here we pass a coroutine for simplicity, but we could pass a more complex object
  var cancellableCoroutine = toOneShot(getCurrentCoroutine())
  var customData = cast[pointer](cancellableCoroutine)
  var bytesRead: int
  var cancellationCb: proc()
  registerReadFile(ActiveSelector, fd, customData, addr(buffer[0]), size,
                   addr(bytesRead), cancellationCb)
  if timeoutMs != -1:
    registerTimer(cancellationCb, cancellableCoroutine, timeoutMs) ## -> this could be executed in a different thread
  suspend() ## After suspend, we could be in another thread
  ## We could be resumed either because of timer cancellation or because the read operation completed
  buffer.setLen(bytesRead)
It might not be very useful, but here is my draft implementation of registerReadFile
for Windows (I'm not showing you the one for select, nothing fancy):
proc registerReadFile*(selector: Selector, fd: AsyncFd, customData: pointer,
                       buffer: pointer, size: int, bytesRead: ptr int,
                       cancellation: out proc()) =
  var customOverlappedPtr = cast[ptr CustomOverlapped](
    allocShared0(sizeof(CustomOverlapped)))
  customOverlappedPtr[] = CustomOverlapped(customData: customData, bytesReadPtr: bytesRead)
  var overlappedPtr = cast[ptr OVERLAPPED](customOverlappedPtr) # this is what std/asyncdispatch does
  let success = bool(readFile(fd.Handle, buffer, size.int32, nil, overlappedPtr))
  if not success:
    let osError = osLastError()
    if osError.int32 == ERROR_HANDLE_EOF:
      # TODO: Add it inside the selector queue
      discard # For now
    else:
      deallocShared(customOverlappedPtr)
      raiseOSError(osError)
  cancellation = proc() = cancel(selector, overlappedPtr)
Do you think this is the correct approach?
FYI ringabout has done something similar before, check out https://github.com/ringabout/wepoll https://github.com/ringabout/ioselectors and https://github.com/ringabout/httpx.
Thanks, that's interesting. I have been looking for something similar, but the following limitation is a no-go:
https://github.com/ringabout/wepoll/tree/master/src/wepoll
Limitations: Only works with sockets.
It also doesn't seem to support https://learn.microsoft.com/en-us/windows/win32/fileio/cancelio, which I need.
I think I found an appropriate design.
I'm currently writing the implementation for both IOCP and EPOLL, and it seems possible while maintaining performance.
type IOOperation = object

proc applyIO*(ioOperation: IOOperation, buffer: pointer): int
proc consume*(ioOperation: ptr IOOperation): pointer # returns userdata, can be used for cancellation
proc select*(selector: Selector, timeoutMs: int): ptr IOOperation
proc readFileAsync*(selector: Selector, fd: AsyncFd, buffer: pointer, size: int,
                    userData: pointer, ioOperation: var IOOperation)
And a cross-platform usage:
## Fiber 1
proc main() =
  var s = newSelector()
  var fd = AsyncFd(0)
  registerHandle(s, fd)
  var buffer = newString(1024)
  var ioOperation: IOOperation
  readFileAsync(s, fd, addr(buffer[0]), 10, cast[pointer](getCurrentCoroutine()), ioOperation)
  suspend()
  setLen(buffer, ioOperation.applyIO(addr(buffer[0])))
  echo buffer

resume(newCoroutine(main))

## Fiber 2
while true:
  var io = select(s, -1)
  resume(cast[Coroutine](io.consume()))
Of course, because it uses raw memory it is harder to debug, but it should fit well into a multithreaded design.
Do you think this is the correct approach?
Good question. Unfortunately I don't know. However, in general we should try to avoid .closure
callbacks, as the hidden state that is passed around is exactly the state that will cause problems for multithreading.
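To illustrate the concern with a toy sketch (my example, not from this thread): a closure silently captures its environment, which is GC-managed and tied to the creating thread, whereas a plain proc plus an explicit state pointer keeps everything visible and movable:

```nim
# Hidden state: the closure environment silently captures `msg`;
# that environment is GC-managed and bound to the creating thread.
proc makeClosureCb(msg: ref string): proc() {.closure.} =
  result = proc() = echo msg[]

# Explicit state: a {.nimcall.} proc plus a raw pointer; nothing is
# hidden, so the pair can be handed to another thread deliberately.
type ExplicitCb = object
  fn: proc(state: pointer) {.nimcall.}
  state: pointer

proc invoke(cb: ExplicitCb) = cb.fn(cb.state)
```

The explicit variant is essentially what the `customData: pointer` parameter in the earlier signatures already does.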
Thanks, that's already a start.
Good question. Unfortunately I don't know
API design is difficult, especially when it involves wrapping low-level APIs, so I am not surprised! And I bet you are not much of an async guy ;-)
However, the fact that you don't completely dismiss the design is a good sign that I am not going entirely the wrong way! I am confident I will come up with something.
avoid .closure callbacks as the hidden state
Agreed, I will avoid that. I realized it causes issues even in a single-threaded design, and I think I can make the state explicit without sacrificing too much performance.
I think it will likely take two weeks (sigh.) before I can write the whole implementation of the async API with most corner cases covered. So, although you could contribute to some of the multithreading design, you will likely be blocked until I finish this part (so it's up to you whether you want to wait). I'll inform you then.
I am coming back with news after quite a long time, and I apologize. Some of my time was taken up by other projects and I am not sure when I will be able to make more progress.
In my opinion, there is a problem in how std/selectors works: by design it relies on blocking calls, which prevents leveraging multithreading. Furthermore, using blocking calls induces a large number of syscalls, which can be greatly reduced under select, poll, epoll and kqueue by using non-blocking reads with EAGAIN. Moreover, epoll and kqueue have special flags (like EPOLLET) to allow efficient multithreading. Here is an interesting article discussing this: https://eklitzke.org/blocking-io-nonblocking-io-and-epoll. This problem affects any library relying on std/selectors, like std/asyncdispatch and guzba/mummy, and likely adds significant overhead. It also creates a design flaw in NimGo, where registering a file descriptor for both read and write results in a busy loop (I am not sure how asyncdispatch and mummy avoid that, do they?)
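The non-blocking pattern the article describes can be sketched as follows (my sketch, not NimGo code; it assumes the fd was already put into O_NONBLOCK mode): once epoll/kqueue reports the fd as readable, read until EAGAIN so that a single wakeup amortizes many reads.

```nim
import std/[posix, os]

# Sketch: drain a non-blocking fd after a readiness event,
# instead of pairing one blocking read with one select() syscall.
proc drainFd(fd: cint, data: var string) =
  var buf: array[4096, char]
  while true:
    let n = read(fd, addr buf[0], csize_t(buf.len))
    if n > 0:
      for i in 0 ..< n:
        data.add(buf[i])
    elif n == 0:
      break # EOF
    elif errno == EINTR:
      continue # interrupted by a signal, retry
    elif errno == EAGAIN or errno == EWOULDBLOCK:
      break # kernel buffer drained: re-arm and wait for the next event
    else:
      raiseOSError(osLastError())
```

With edge-triggered epoll (EPOLLET), draining until EAGAIN is not just an optimization but a requirement, since the readiness notification is delivered only once per transition.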
I think the most elegant and efficient way to allow multithreading is to have a dedicated threadpool for IO (syscalls) and a dedicated threadpool for CPU tasks. std/asyncdispatch could also benefit from this without changes to its public API.
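A minimal sketch of that split, with every name (IoPool, CpuPool, spawn, and the blocking helpers) purely hypothetical, not an existing API:

```nim
# Hypothetical sketch: two pools so that a blocked syscall never
# starves CPU-bound tasks, and vice versa.
var ioPool = newIoPool(workers = 4)            # workers may block freely on syscalls
var cpuPool = newCpuPool(workers = countProcessors())

proc handleRequest(fd: AsyncFd) =
  ioPool.spawn:
    let request = blockingReadAll(fd)          # syscall runs on an IO worker
    cpuPool.spawn:
      let answer = heavyCompute(request)       # CPU task never waits on IO
      ioPool.spawn:
        blockingWriteAll(fd, answer)
```

The design choice here is that only the IO pool ever sleeps inside the kernel, so the CPU pool can be sized to the number of cores without oversubscription.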
I have begun to write a draft of how a multithreaded selector library could look here: https://github.com/Alogani/nectar/tree/devel. However, my competence in multithreading is lacking, and the presence of multiple performance-critical sections (avoiding lock contention) doesn't help.
On the continuation of the NimGo project:
Very sorry to hear that, but the community will appreciate your efforts.
This is the last major piece of work before work on multithreading (#21) can begin.
The goal is to create a portable API to query asynchronous events. This will likely involve a significant redesign of the public API of the nimgo/eventdispatcher library, and hopefully the API won't require many changes after this update.