ocaml-multicore / eio

Effects-based direct-style IO for multicore OCaml

POSIX backend keeps polling fd after close and ignores POLLNVAL #572

Status: Closed (quernd closed this issue 1 year ago)

quernd commented 1 year ago

We've come across an issue where the POSIX backend keeps polling a file descriptor after it has been closed. The kernel reports POLLNVAL (invalid fd) for it, but Eio ignores that and polls again. In strace it looks like this:

```
write(5, "\27\3\3\0\"vl.9)\310\313\3369V\212\246^v\332\1\342m\372\267t\310W\327\360(#"..., 39) = 39
write(5, "\27\3\3\0\23\0004\354~CG\361\255 \252U\304\23\7\270\25f\10,", 24) = 24
close(5)                                = 0
ppoll([{fd=-1}, {fd=-1}, {fd=-1}, {fd=3, events=POLLIN}, {fd=-1}, {fd=5, events=POLLIN}], 6, NULL, NULL, 8) = 1 ([{fd=5, revents=POLLNVAL}])
ppoll([{fd=-1}, {fd=-1}, {fd=-1}, {fd=3, events=POLLIN}, {fd=-1}, {fd=5, events=POLLIN}], 6, NULL, NULL, 8) = 1 ([{fd=5, revents=POLLNVAL}])
```

Eio then keeps calling ppoll in an infinite loop: each call returns immediately with POLLNVAL for fd 5, so the loop never blocks. (This also seems to trigger a lot of GC activity, as you can see by running with OCAMLRUNPARAM=v=4095.)
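To see the kernel behaviour in isolation, here is a minimal sketch in Python. It is not Eio code, and `poll_closed_fd` is a name made up for illustration. It registers the read end of a pipe with `select.poll`, closes it while it is still registered, and polls once. On Linux the stale descriptor should come back flagged with POLLNVAL right away, which is why a loop that ignores the flag never gets to sleep.

```python
import os
import select


def poll_closed_fd():
    """Poll a descriptor that was closed while still registered.

    Returns the (now stale) fd number and the event list from one poll call.
    """
    r, w = os.pipe()
    poller = select.poll()
    poller.register(r, select.POLLIN)

    # Close the fd but leave it in the poll set, mirroring the strace above.
    os.close(r)

    # Timeout 0: return immediately with whatever the kernel reports.
    events = poller.poll(0)

    os.close(w)
    return r, events


if __name__ == "__main__":
    fd, events = poll_closed_fd()
    for event_fd, revents in events:
        flagged = bool(revents & select.POLLNVAL)
        print(f"fd={event_fd} revents={revents:#x} POLLNVAL={flagged}")
```

Calling `poller.poll(0)` again without unregistering the descriptor would report the same event each time, matching the repeated `ppoll` lines in the trace.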

A straightforward reproduction is here: https://github.com/quernd/eio-issue-repro

Unfortunately it relies on networking, but we've found that it reproduces the problem reliably.

I haven't investigated why the fd is still being polled after it has been closed, but I will soon submit a small PR to handle POLLNVAL, which at least mitigates the issue.
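As a rough illustration of what such a mitigation could look like in general, here is a Python sketch of a poll loop that drops a descriptor as soon as it is reported with POLLNVAL. This is an assumption about the general technique, not the contents of the PR or Eio's implementation. The names `run_loop`, `handlers`, and `max_rounds` are hypothetical.

```python
import os
import select


def run_loop(poller, handlers, max_rounds=10):
    """Hypothetical poll loop: `handlers` maps fd -> callback(fd, revents).

    Descriptors reported with POLLNVAL are unregistered instead of being
    polled again. Returns the number of poll rounds and the dropped fds.
    """
    rounds = 0
    dropped = []
    while handlers and rounds < max_rounds:
        rounds += 1
        for fd, revents in poller.poll(0):
            if revents & select.POLLNVAL:
                # The fd is no longer open: stop polling it.
                poller.unregister(fd)
                handlers.pop(fd, None)
                dropped.append(fd)
            else:
                handlers[fd](fd, revents)
    return rounds, dropped


if __name__ == "__main__":
    r, w = os.pipe()
    poller = select.poll()
    poller.register(r, select.POLLIN)
    os.close(r)  # closed while still registered
    rounds, dropped = run_loop(poller, {r: lambda fd, ev: None})
    os.close(w)
    print(f"rounds={rounds} dropped={dropped}")
```

This only stops the spinning. It does not explain why a closed descriptor was still in the poll set, which is the underlying question.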

talex5 commented 1 year ago

Thanks for the test-case - that was really useful!