Open lukts30 opened 2 years ago
Replace the self with pid in the output and you can get the real path.
Do you mean that I can do that from within an eBPF program? Substituting self with the PID and then doing a realpath from userspace is too racy, since the file descriptor number gets recycled immediately after the mount.
What is your use case?
I was trying to understand which mount calls podman+crun make during container startup. As far as I could tell, it makes a few dozen openat2 calls and then uses each file descriptor via its procfs path in the mount call.
for path in paths do:
    fd = openat(path)
    mount(..., "/proc/self/fd/N", ..., ..., ...)
    close(fd)
If I tried a realpath(/proc/self/fd/9) from userspace, it would most likely point to a different path, since after the first iteration the kernel is free to reassign file descriptor 9.
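To make the race concrete, here is a minimal Python sketch of the loop above (Linux only, since it reads /proc/self/fd). It is not what crun does internally; the function name demo_fd_reuse and the directory names are made up for illustration. Each directory is opened, resolved through procfs while the descriptor is still valid, and closed again. In a single-threaded process the kernel hands back the lowest free descriptor number, so both iterations see the same number pointing at different paths.

```python
import os
import tempfile


def resolve_fd(fd: int) -> str:
    """Resolve one of this process's descriptors through procfs (Linux only)."""
    return os.readlink(f"/proc/self/fd/{fd}")


def demo_fd_reuse():
    """Open two directories one after the other, closing each before the next.

    Returns a list of (fd_number, resolved_path) pairs, one per iteration.
    """
    seen = []
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("first", "second"):  # hypothetical mount sources
            path = os.path.join(tmp, name)
            os.mkdir(path)
            fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
            try:
                # Only inside this window does /proc/self/fd/<fd> name `path`.
                seen.append((fd, resolve_fd(fd)))
            finally:
                os.close(fd)
    return seen


if __name__ == "__main__":
    for fd, path in demo_fd_reuse():
        print(f"fd={fd} -> {path}")
```

An observer that only learns the number after close(fd) has no way to tell which of the two paths it referred to, which is why the resolution has to happen at the time of the mount call.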
For this specific use case the snippet below does work, since I know that the file descriptor originates from an openat2 call. But there are ways to acquire an fd without ever passing a path to the kernel via an open call (Unix domain sockets with SCM_RIGHTS, or pidfd_getfd).
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat2 { printf("open: %s %s ", comm, str(args->filename)); } tracepoint:syscalls:sys_exit_openat2 { printf("fd=%d\n", args->ret); } tracepoint:syscalls:sys_enter_mount { printf("%s mount %s\n", comm, str(args->dir_name)); }'
OK, I see.
I think you can create a new tool which combines the functionalities of opensnoop and mountsnoop.
But does opensnoop or any other tool handle externally acquired file descriptors (e.g. pidfd_getfd)? It seems likely that one could still correlate them, but at the very least the fd number would be different.
Process A: fd=open(path); open returns 9. Process A sends only the number 9 to Process B. Process B: pidfd_getfd(A,9); pidfd_getfd returns 7, and B then does something that involves /proc/self/fd/7.
So, in this case, A and B have file descriptors with different numbers, but both refer to the same file in the kernel. Something like this is used by lxd/lxc to intercept syscalls from unprivileged processes and redo them in a supervisor process.
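A small Python sketch of the "different number, same file" situation. It uses the SCM_RIGHTS route mentioned earlier (socket.send_fds / socket.recv_fds, Python 3.9+) rather than pidfd_getfd, because as far as I know the Python standard library has no pidfd_getfd wrapper. For brevity both ends of the socket pair live in one process; the helper name pass_fd_over_socket is made up for this example.

```python
import os
import socket
import tempfile


def pass_fd_over_socket(path: str) -> dict:
    """Open `path`, pass the descriptor over a Unix socket pair via SCM_RIGHTS,
    and compare the sent and the received descriptor (Linux only)."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    sent_fd = os.open(path, os.O_RDONLY)
    try:
        # At least one byte of ordinary data must accompany the descriptor.
        socket.send_fds(left, [b"x"], [sent_fd])
        _data, fds, _flags, _addr = socket.recv_fds(right, 1, 1)
        received_fd = fds[0]
        try:
            return {
                "sent_fd": sent_fd,
                "received_fd": received_fd,
                "sent_path": os.readlink(f"/proc/self/fd/{sent_fd}"),
                "received_path": os.readlink(f"/proc/self/fd/{received_fd}"),
                # Same device and inode means both numbers name the same file.
                "same_file": os.path.samestat(
                    os.fstat(sent_fd), os.fstat(received_fd)
                ),
            }
        finally:
            os.close(received_fd)
    finally:
        os.close(sent_fd)
        left.close()
        right.close()


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(pass_fd_over_socket(tmp.name))
```

The receiver never called open with a path, so a tool that only watches open-style syscalls in the receiving process has nothing to attach to the new number.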
The only solution that covers all these scenarios would be to have something in ebpf that resolves symlinks.
You can trace both open and pidfd_getfd, and use a BPF map to store the path.
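As a rough illustration of that bookkeeping, here is a plain Python model of the table such a tool might keep. This is not BPF code and not an existing bcc tool: the class FdPathTable, its method names, and the example path are all hypothetical. It also assumes the tracer has already translated the pidfd argument of pidfd_getfd into the target process's PID, and it stores the path string as passed to open, which is not necessarily a canonical realpath.

```python
from typing import Dict, Tuple

Key = Tuple[int, int]  # (pid, fd)


class FdPathTable:
    """Userspace model of a map keyed by (pid, fd) that remembers paths."""

    PREFIX = "/proc/self/fd/"

    def __init__(self) -> None:
        self._paths: Dict[Key, str] = {}

    def on_open(self, pid: int, fd: int, path: str) -> None:
        """Record the path seen at open entry once the fd is known at exit."""
        self._paths[(pid, fd)] = path

    def on_pidfd_getfd(self, pid: int, new_fd: int,
                       target_pid: int, target_fd: int) -> None:
        """Copy the path from the source descriptor to the duplicated one."""
        path = self._paths.get((target_pid, target_fd))
        if path is not None:
            self._paths[(pid, new_fd)] = path

    def on_close(self, pid: int, fd: int) -> None:
        """Forget the entry so a recycled number is not misattributed."""
        self._paths.pop((pid, fd), None)

    def resolve_mount_target(self, pid: int, target: str) -> str:
        """Map /proc/self/fd/N back to a recorded path, else return as-is."""
        suffix = target[len(self.PREFIX):]
        if target.startswith(self.PREFIX) and suffix.isdigit():
            return self._paths.get((pid, int(suffix)), target)
        return target


if __name__ == "__main__":
    table = FdPathTable()
    table.on_open(pid=100, fd=9, path="/example/rootfs")  # process A
    table.on_pidfd_getfd(pid=200, new_fd=7, target_pid=100, target_fd=9)  # B
    print(table.resolve_mount_target(200, "/proc/self/fd/7"))
```

This only covers the acquisition paths that are explicitly traced; descriptors that arrive some other way (for example via SCM_RIGHTS, or inherited across fork) would still show up unresolved unless those events are handled too.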
I tried debugging which mount calls podman+crun make during container creation. I tried using mountsnoop, but that was not helpful since the mount calls involved /proc/self/fd/N. With strace:
I expect more tools to make use of the newer openat2 in the future, so it would be an improvement if mountsnoop could print the realpath, but I do not know if there is a bpf helper for realpath.