nix-rust / nix

Rust friendly bindings to *nix APIs
MIT License
2.57k stars 650 forks source link

fix: only close `fanotify` events with a valid fd #2399

Closed tombl closed 1 month ago

tombl commented 1 month ago

What does this PR do

fanotify returns events with the fd set to FAN_NOFD when the queue overflows. Trying to close it panics with EBADF, so don't bother.

Checklist:

SteveLauC commented 1 month ago

Would it be possible to write a test that triggers a queue overflow?

Theoretically, yes. One can change the limit value of the queue length through file /proc/sys/fs/fanotify/max_queued_events, so we can set the limit to a small number to trigger the overflow, but this change is system-wide, i.e., it will affect other processes that are using the fanotify API. For CI, I think running such a test is fine, but testing this on contributors's machine would be kinda bad?

asomers commented 1 month ago

Would it be possible to write a test that triggers a queue overflow?

Theoretically, yes. One can change the limit value of the queue length through file /proc/sys/fs/fanotify/max_queued_events, so we can set the limit to a small number to trigger the overflow, but this change is system-wide, i.e., it will affect other processes that are using the fanotify API. For CI, I think running such a test is fine, but testing this on contributors's machine would be kinda bad?

And is the original value impractically high to reach?

SteveLauC commented 1 month ago

And is the original value impractically high to reach?

Ahh, I think don't need to bother with that config file, the default value is 16384, it should be easy to reach it within 1-2 seconds with one thread repeatedly generating events

SteveLauC commented 1 month ago

Ok, I tried to write a test for this issue today, now I kinda think it is not possible to do that.

We want the queue to be overflowed, so we have to trigger at least 16385 events, which should be easy with one thread looping reading files, however, Linux kernel would merge events that come from the same process (or thread if FAN_REPORT_TID is specified in fanotify_init(2)), so even though we can generate 16385 events using 1 thread, the Linux kernel will merge them into 1 event.

Then what about using 2 threads to make them access files alaternatively, then the events in the queue would theoretically be something like:

[from_worker1, from_worker2, from_worker1, from_worker2, ...]

Well, this won't work either, Linux kernel will reorder and coalesce them.

To not let Linux kernel merge events, we have 2 approaches:

  1. Create 16385 events that come from 16385 unique threads, which is not practical as

    1. That's too many threads (Linux has a system-wide threads limit that is related to the memory size, on my laptop (16GiB), the limit is 108685)
    2. What about we just create-and-drop the threads, Linux will reuse the thread IDs from the dropped threads.
  2. Use permission event rather than notification event

    Permission event won't be merged but it requires the listener to write a response back to the thread or this thread will be blocked forever, then it looks quite similar to the approach 1, we need tons of threads, which is not feasible.

SteveLauC commented 1 month ago

Gentle ping on the PR author @tombl, do you have any idea on the test? :)

tombl commented 1 month ago

I've also been struggling to write a test. I hit this behavior in my application that uses fanotify, but that was with >16384 unique processes making changes. I think spawning 16385 processes that make changes would work, but seems excessive. I also think the spawn-and-exit trick with threads might be enough.

SteveLauC commented 1 month ago

I also think the spawn-and-exit trick with threads might be enough.

Yeah, I reproduced the overflow issue with 16385 threads, but can we stop Linux kernel from reusing thread IDs?

tombl commented 1 month ago

I don't think it does reuse the thread IDs except when they overflow /proc/sys/kernel/pid_max.

It should only be an issue if someone lowers their pid_max below their max_queued_events.

SteveLauC commented 1 month ago

I don't think it does reuse the thread IDs except when they overflow /proc/sys/kernel/pid_max.

From this post, it indeed is, thanks for showing me this, TIL!