Open mihalicyn opened 1 year ago
https://lore.kernel.org/lkml/ccf6cea1-1139-cd73-c4e5-dc9799708bdd@living180.net/T/
It quite suspiciously reminds me of this issue. Long story short, multishot poll should only be used with nonblocking files. I assumed Christian was going to fix it up, apparently didn't happen. I'd encourage you to submit a patch to LXC.
making terminals FDs non-blocking (without disabling multishot mode) makes reproductions rarer (https://github.com/lxc/lxc/commit/715fb4effaa7f35e642056d2a10d5b8c393bfac0)
I tested it back then, worked for me. Is there a chance there is yet another blocking file you missed? It might also be a separate problem.
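For readers following along, the "nonblocking files" advice above can be sketched in plain C as follows. This is a minimal illustration and not LXC's actual patch; the helper name `set_nonblocking` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper (not taken from LXC): add O_NONBLOCK to a
 * descriptor while keeping its other file status flags.
 * Returns 0 on success, -1 on failure with errno set by fcntl().
 */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);                 /* caller must pass an open descriptor */

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (flags & O_NONBLOCK)
        return 0;                    /* already non-blocking, nothing to do */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

Note that O_NONBLOCK is a file status flag stored in the open file description, so it is shared with every descriptor duplicated from the same open (for example a terminal FD inherited by another process).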
Hi Pavel,
yes, I came to the same conclusion during debugging and committed this patch (https://github.com/lxc/lxc/commit/715fb4effaa7f35e642056d2a10d5b8c393bfac0). It makes reproduction rarer.
I'm almost sure that I've covered all places, but I'll check again. Since reproduction became rare after making the FDs non-blocking, it is harder to debug what's happening. My understanding is that we get into an infinite loop: a poll CQE arrives, we go to read (and get something), then a new poll CQE arrives, we go to read (and get nothing), another poll CQE arrives, we go to read (and get nothing), and so on forever. My suspicion was that for some reason read wakes POLLIN for ttys... but I've checked git log for the TTY line discipline driver and ptmx and found nothing suspicious.
Kind regards, Alex
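The sequence described above (poll CQE, read returns data, next poll CQE, read returns nothing) can at least be kept from hanging by a handler that drains the descriptor until EAGAIN and treats an empty read as a spurious wakeup. The sketch below is an assumption about how such a handler could look, not LXC's code; `drain_fd` and the `drain_result` values are hypothetical names. It does not explain where the extra CQEs come from.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of handling one poll completion (hypothetical names). */
enum drain_result {
    DRAIN_DATA,         /* at least one byte was read */
    DRAIN_SPURIOUS,     /* nothing to read: the wakeup was spurious */
    DRAIN_EOF,          /* peer closed the descriptor */
    DRAIN_BLOCKING_FD,  /* refused: fd is not O_NONBLOCK, read() could hang */
    DRAIN_ERROR         /* fcntl() or read() failed, errno is set */
};

/* Read everything currently available from fd; *total gets the byte count. */
enum drain_result drain_fd(int fd, size_t *total)
{
    char buf[4096];
    int flags;

    assert(total != NULL);
    *total = 0;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return DRAIN_ERROR;
    if (!(flags & O_NONBLOCK))
        return DRAIN_BLOCKING_FD;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));

        if (n > 0) {
            *total += (size_t)n;   /* a real handler would consume buf here */
            continue;
        }
        if (n == 0)                /* EOF; if data came first, report data now */
            return *total > 0 ? DRAIN_DATA : DRAIN_EOF;
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return *total > 0 ? DRAIN_DATA : DRAIN_SPURIOUS;
        return DRAIN_ERROR;
    }
}
```

Counting how often `DRAIN_SPURIOUS` is returned per descriptor is a cheap way to confirm from userspace that the event loop is spinning on empty wakeups.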
Ah, right, and thank you for finally fixing this one!
Ok, so it's busy looping doing nothing useful. We need to understand what wakes it up. Can you run bpftrace for 10-20s after that state of nothingness kicks in? This one should do:
bpftrace -e 'kprobe:io_poll_wake { @[kstack] = count(); }'
You can specify the pid and an output log file with -p and -o respectively.
Any news on that one?
Hi Pavel,
I'm really sorry for the long delay in replying. I missed a notification and then forgot to get back to the issue. After we landed https://github.com/lxc/lxc/pull/4304 and stopped using multishot polling, everything works just fine.
So, I don't think that there is any issue on the io_uring side. If I find anything interesting there, or if I see any further issues, I'll definitely do some kernel debugging and tracing myself and then contact you with some insights ;-)
I think we can close this for now.
Sorry for taking your time!
Kind regards, Alex
Dear colleagues,
I'm trying to debug a problem with spurious poll events when multishot mode (IORING_POLL_ADD_MULTI) is enabled.
Unfortunately, I haven't managed to make a minimal reproducer for the issue yet.
Currently, it's reproducible with LXC when it's built with -Dio-uring-event-loop=true. The reproducer itself is fairly easy:
./build/src/lxc/tools/lxc-start -F test_ct -l TRACE -o lxcstart.log
Just enter a few symbols in the terminal (or press Enter a few times) and it gets stuck forever.
The reason for the hang is that the LXC process gets a poll CQE and then reads from the FD; right after the read it gets a new CQE for the same FD, goes to read again, and gets stuck forever.
I started investigating this problem from the userspace (LXC) side and found that:
disabling IORING_POLL_ADD_MULTI solves the problem (https://github.com/lxc/lxc/commit/7fd671dbce98d139e52e8c4266f1050ef49ea8af)
That was a hint for me, and I started digging into the problem on the kernel side. The main point of interest in the kernel is the function io_poll_check_events.
I found a strange thing: sometimes, after req->cqe.res = 0, vfs_poll isn't called on the next iteration. It seemed strange to me, and I found that req->cqe.res is changed to a non-zero value between iterations by the io_poll_wake function.
It's worth mentioning that the problem happens with the tty_fops and ptmx_fops FDs, but we also use signalfd and socket FDs in the same io_uring instance.
The purpose of filing this issue is to share my experience (maybe someone else has met this problem too) and to get debugging advice. :)
Kind regards, Alex
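As a userspace-side illustration of the symptom reported above (a poll CQE followed by a read that has nothing to return), a handler can re-check readiness with a zero-timeout poll(2) before reading. This is only a defensive sketch under the assumption that the descriptor has a single reader; it is not what LXC ended up doing (LXC stopped using multishot mode), and the names `fd_ready_now` and `read_if_ready` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel whether fd is readable right now, without blocking.
 * Returns 1 if there is data (or a hangup/error to pick up via read),
 * 0 if there is nothing to read, -1 if poll() itself failed.
 */
int fd_ready_now(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    assert(fd >= 0);
    do {
        rc = poll(&pfd, 1, 0);          /* timeout 0: never sleeps */
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        return -1;
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) ? 1 : 0;
}

/*
 * Read only if the readiness re-check passes. Returns the read() result,
 * or -1 with errno == EAGAIN when the wakeup turned out to be spurious.
 */
ssize_t read_if_ready(int fd, void *buf, size_t len)
{
    int ready = fd_ready_now(fd);

    if (ready < 0)
        return -1;
    if (ready == 0) {
        errno = EAGAIN;                 /* report "nothing to read" */
        return -1;
    }
    return read(fd, buf, len);
}
```

With more than one reader on the same descriptor the check and the read can race, so this is a debugging aid more than a fix.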