ocaml-multicore / eio

Effects-based direct-style IO for multicore OCaml

Signal handlers may not run, especially with eio_linux #732

Open talex5 opened 6 months ago


Since #728, we submit requests and wait for replies in a single system call. If a signal arrives while we are waiting in io_uring_enter, but after some requests have already been accepted, the call returns the number of requests submitted rather than failing with EINTR. liburing then retries the wait without returning to OCaml, so the OCaml part of the signal handler doesn't run.
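To make the control flow concrete, here is a toy model of the behaviour described above. It is only a sketch: it does not call io_uring or liburing, and every name in it (`enter_result`, `run_wait_loop`, `retry_on_partial`) is hypothetical. The "kernel" is a scripted list of results, and the loop stands in for the C-side wait.

```ocaml
(* Toy model only: nothing here is liburing's or Eio's real code. *)

(* What one simulated io_uring_enter call reports back. *)
type enter_result =
  | Eintr              (* interrupted before any request was accepted *)
  | Submitted of int   (* interrupted after this many requests were accepted *)
  | Completed          (* a completion became available *)

type outcome =
  | Returned_to_ocaml of int  (* control came back after this many calls *)
  | Still_blocked of int      (* script exhausted: the next wait has nothing to wake it *)

(* [retry_on_partial:true] models the behaviour described in this issue:
   a partial submission is treated as "keep waiting", so the pending
   signal is never seen by OCaml. [false] models returning to the caller
   instead, which would let the OCaml handler run. *)
let run_wait_loop ~retry_on_partial script =
  let rec loop calls = function
    | [] -> Still_blocked calls
    | (Eintr | Completed) :: _ -> Returned_to_ocaml (calls + 1)
    | Submitted _ :: rest ->
      if retry_on_partial then loop (calls + 1) rest
      else Returned_to_ocaml (calls + 1)
  in
  loop 0 script

let describe = function
  | Returned_to_ocaml n -> Printf.sprintf "returned to OCaml after %d call(s)" n
  | Still_blocked n -> Printf.sprintf "still blocked in C after %d call(s)" n

let () =
  (* A signal arrives after two requests were accepted; nothing completes. *)
  let script = [Submitted 2] in
  print_endline (describe (run_wait_loop ~retry_on_partial:true script));
  print_endline (describe (run_wait_loop ~retry_on_partial:false script))
```

In this model, the scripted `[Submitted 2]` case leaves the retrying loop blocked, while the variant that returns on a partial submission hands control back after one call. A plain `Eintr` returns in both variants, which matches the observation that the problem only appears once some requests have been accepted.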

This particular problem could be fixed by modifying liburing. However, there is another race affecting system calls more generally (https://github.com/ocaml/ocaml/issues/13189) and fixing that would also fix this as a side-effect.

This bug can cause Lwt_eio's tests to hang, as Lwt doesn't get the SIGCHLD signal it is expecting (Eio doesn't suffer from this problem itself because it uses process FDs to wait for child processes instead). We've also had similar problems with other libraries in the past (e.g. #400).