Closed tombl closed 1 month ago
Would it be possible to write a test that triggers a queue overflow?
Theoretically, yes. One can change the queue-length limit through the file /proc/sys/fs/fanotify/max_queued_events, so we can set the limit to a small number to trigger the overflow. However, the change is system-wide, i.e. it will affect every other process that is using the fanotify API. For CI, I think running such a test is fine, but running it on a contributor's machine would be kinda bad?
And is the original value impractically high to reach?
Ahh, I don't think we need to bother with that config file; the default value is 16384, so it should be easy to reach within 1-2 seconds with one thread repeatedly generating events.
Ok, I tried to write a test for this issue today, and now I think it may not be possible.
We want the queue to overflow, so we have to trigger at least 16385 events, which should be easy with one thread reading files in a loop. However, the Linux kernel merges events that come from the same process (or thread, if FAN_REPORT_TID is specified in fanotify_init(2)), so even though we can generate 16385 events from one thread, the kernel will merge them into a single event.
Then what about using 2 threads that access files alternately? The queue would theoretically look like:
[from_worker1, from_worker2, from_worker1, from_worker2, ...]
Well, this won't work either: the Linux kernel will reorder and coalesce them.
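As a toy model of why alternating producers doesn't help: if the kernel coalesces a new event with any identical event already queued (a deliberate simplification — the real merging logic is more involved), two alternating threads still collapse into just two queue entries. A sketch with hypothetical types:

```go
package main

import "fmt"

// event is a simplified stand-in for struct fanotify_event_metadata.
type event struct {
	pid  int
	mask uint64
}

// enqueue models coalescing: drop the new event if an identical one is
// already queued. This is a simplification, not the kernel's algorithm.
func enqueue(q []event, e event) []event {
	for _, old := range q {
		if old == e {
			return q
		}
	}
	return append(q, e)
}

func main() {
	var q []event
	// Two workers alternately generating the same kind of event...
	for i := 0; i < 10000; i++ {
		q = enqueue(q, event{pid: 100, mask: 1})
		q = enqueue(q, event{pid: 101, mask: 1})
	}
	fmt.Println(len(q)) // ...still occupy only two queue slots.
}
```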
To keep the Linux kernel from merging events, we have 2 approaches:
1. Create 16385 events that come from 16385 unique threads, which is not practical.
2. Use permission events rather than notification events.
Permission events won't be merged, but each one requires the listener to write a response back, or the triggering thread will be blocked forever. That ends up looking a lot like approach 1: we need tons of threads, which is not feasible.
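For reference, each permission-event response is an 8-byte struct fanotify_response (an int32 fd plus a uint32 response such as FAN_ALLOW) written back to the fanotify fd. A hedged sketch of the encoding — the constants are from linux/fanotify.h, the helper name is made up, and a little-endian host is assumed (matching the kernel's native layout there):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Response values from linux/fanotify.h.
const (
	fanAllow = 0x01
	fanDeny  = 0x02
)

// encodeResponse builds the 8-byte struct fanotify_response payload
// (int32 fd, uint32 response) that a listener writes back to the
// fanotify fd to unblock the thread that triggered a permission event.
// Assumes a little-endian host.
func encodeResponse(fd int32, response uint32) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:4], uint32(fd))
	binary.LittleEndian.PutUint32(buf[4:8], response)
	return buf
}

func main() {
	fmt.Println(len(encodeResponse(5, fanAllow))) // 8 bytes per response
}
```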
Gentle ping to the PR author @tombl: do you have any ideas for the test? :)
I've also been struggling to write a test. I hit this behavior in my application that uses fanotify, but that was with >16384 unique processes making changes. I think spawning 16385 processes that make changes would work, but it seems excessive. I also think the spawn-and-exit trick with threads might be enough.
Yeah, I reproduced the overflow issue with 16385 threads, but can we stop the Linux kernel from reusing thread IDs?
I don't think it reuses thread IDs except when they overflow /proc/sys/kernel/pid_max. It should only be an issue if someone lowers their pid_max below their max_queued_events.
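That precondition can be checked at test time. A sketch, assuming the two procfs paths above; the fallbacks (4194304, a common modern pid_max default, and the documented 16384) only matter off-Linux, and the function names are made up:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readSysctlInt returns the integer in a procfs file, or def if the
// file is missing or malformed.
func readSysctlInt(path string, def int) int {
	data, err := os.ReadFile(path)
	if err != nil {
		return def
	}
	n, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return def
	}
	return n
}

// tidReuseSafe reports whether pid_max exceeds max_queued_events, i.e.
// 16385 freshly spawned threads cannot wrap around and reuse TIDs within
// a single queue's worth of events.
func tidReuseSafe() bool {
	pidMax := readSysctlInt("/proc/sys/kernel/pid_max", 4194304)
	maxQueued := readSysctlInt("/proc/sys/fs/fanotify/max_queued_events", 16384)
	return pidMax > maxQueued
}

func main() {
	fmt.Println(tidReuseSafe())
}
```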
What does this PR do
fanotify returns events with the fd set to FAN_NOFD when the queue overflows. Trying to close it panics with EBADF, so don't bother.
Checklist:
CONTRIBUTING.md
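The fix described above can be sketched like this; FAN_NOFD is -1 per linux/fanotify.h, and closeEventFD is a hypothetical name, not the PR's actual code:

```go
package main

import (
	"fmt"
	"syscall"
)

// fanNoFD mirrors FAN_NOFD from linux/fanotify.h: queue-overflow events
// carry this sentinel instead of a real file descriptor.
const fanNoFD = -1

// closeEventFD closes the fd from a fanotify event, skipping the
// FAN_NOFD sentinel so we never hand -1 to close(2) and get EBADF back.
func closeEventFD(fd int32) error {
	if fd == fanNoFD {
		return nil // overflow event: nothing to close
	}
	return syscall.Close(int(fd))
}

func main() {
	fmt.Println(closeEventFD(fanNoFD)) // <nil>: overflow events are skipped
}
```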