mitchellh / libxev

libxev is a cross-platform, high-performance event loop that provides abstractions for non-blocking IO, timers, events, and more. It works on Linux (io_uring or epoll), macOS (kqueue), and Wasm + WASI, and is available as both a Zig and a C API.
MIT License

Add support for watching processes with IOCP #73

Closed kcbanner closed 10 months ago

kcbanner commented 10 months ago

These changes add support for receiving events from JobObjects (https://learn.microsoft.com/en-us/windows/win32/procthread/job-objects) in the IOCP backend.

This feature is then used to implement an IOCP process watcher.

JobObjects are unique in the way they interact with the OVERLAPPED_ENTRY result structure: they completely repurpose all the fields to mean different things. This means I fill in the completion result directly, before perform is called. Let me know if this is an issue; I could rework perform to accept the entry itself.
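To illustrate the field repurposing, here is a minimal, portable sketch (a model in Python, not the libxev code from this PR). Per the Win32 documentation, a job object notification carries the message ID in the byte-count field, the key given at association time in the completion key, and, for per-process messages, the process ID in the slot that normally holds the OVERLAPPED pointer. The `OverlappedEntry`, `JobMessage`, and `decode_job_entry` names are hypothetical and exist only for this example; the message ID values are the ones defined in winnt.h.

```python
from dataclasses import dataclass
from typing import Optional

# Job object message IDs as defined in winnt.h.
JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO = 4
JOB_OBJECT_MSG_NEW_PROCESS = 6
JOB_OBJECT_MSG_EXIT_PROCESS = 7
JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS = 8

# Messages whose "overlapped" slot carries a process ID rather than a pointer.
PER_PROCESS_MESSAGES = {
    JOB_OBJECT_MSG_NEW_PROCESS,
    JOB_OBJECT_MSG_EXIT_PROCESS,
    JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS,
}


@dataclass
class OverlappedEntry:
    """Hypothetical stand-in mirroring the fields of Win32 OVERLAPPED_ENTRY."""
    completion_key: int     # lpCompletionKey
    overlapped: int         # lpOverlapped (normally a pointer value)
    internal: int           # Internal
    bytes_transferred: int  # dwNumberOfBytesTransferred


@dataclass
class JobMessage:
    """Hypothetical decoded view of a job object notification."""
    key: int
    message_id: int
    pid: Optional[int]


def decode_job_entry(entry: OverlappedEntry) -> JobMessage:
    # For job notifications the byte count is really the message ID.
    message_id = entry.bytes_transferred
    # The OVERLAPPED slot is a PID only for per-process messages.
    pid = entry.overlapped if message_id in PER_PROCESS_MESSAGES else None
    return JobMessage(key=entry.completion_key, message_id=message_id, pid=pid)
```

Because none of the fields keep their usual meaning, a generic completion handler that treats the OVERLAPPED slot as a pointer cannot be reused for these entries, which is presumably why the result is filled in before perform runs.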

Remaining TODOs:

kcbanner commented 10 months ago

Just pushed a change fixing the race condition mentioned in the PR description; see the commit message for details. Moving this out of draft status now.

mitchellh commented 10 months ago

Looks great. We're getting regular errors (retried a couple times) in CI. Any thoughts on those?

kcbanner commented 10 months ago

I don't see those locally, but it might be because the process tree in the CI is running under a job object itself - probably to control resource limits. I'll see if I can reproduce the same conditions locally and rework the logic to test if the process is already in a job object, and use that one instead.

mitchellh commented 10 months ago

> I don't see those locally, but it might be because the process tree in the CI is running under a job object itself - probably to control resource limits. I'll see if I can reproduce the same conditions locally and rework the logic to test if the process is already in a job object, and use that one instead.

Thanks. Also happy to reconfigure CI if you have a way to do that. Either way, I'd love for the CI to pass. 😄

kcbanner commented 10 months ago

I won't know for sure until this CI run finishes, but I think the issue was trying to set the JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK flag on the job. That flag would have let child processes of the child process spawned during the test stay outside their parent's job, which, depending on the settings of the parent job, may not be allowed.

I did this initially because it would avoid having to handle (and ignore) events from any subprocesses. I reworked the logic to not set this, and just check the PIDs instead.
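The PID check can be sketched roughly as follows. This is a simplified Python model under assumptions, not the code from this PR: events are represented as hypothetical `(message_id, pid)` pairs already decoded from the completion port, and `watched_process_exited` is an illustrative name. The message ID values are the ones defined in winnt.h.

```python
from typing import Iterable, Tuple

# Job object message IDs as defined in winnt.h.
JOB_OBJECT_MSG_NEW_PROCESS = 6
JOB_OBJECT_MSG_EXIT_PROCESS = 7
JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS = 8

EXIT_MESSAGES = {
    JOB_OBJECT_MSG_EXIT_PROCESS,
    JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS,
}


def watched_process_exited(events: Iterable[Tuple[int, int]], watched_pid: int) -> bool:
    """Return True once an exit message for the watched PID is seen.

    Without the breakaway flag, grandchildren stay in the job, so their
    messages arrive on the same port and have to be skipped here.
    """
    for message_id, pid in events:
        if message_id not in EXIT_MESSAGES:
            continue  # e.g. a new-process notification
        if pid != watched_pid:
            continue  # exit of some other process in the job
        return True
    return False
```

The trade-off is a little extra filtering on every event in exchange for not needing a job limit that an enclosing job (such as one a CI runner might impose) could refuse.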

mitchellh commented 10 months ago

Awesome. Thank you!