lsof-org / lsof

LiSt Open Files
https://lsof.readthedocs.io
Other
426 stars 106 forks source link

[BUG] lsof tests fail with kernel 6.9 #317

Open jirislaby opened 6 months ago

jirislaby commented 6 months ago

As described in

kernel commit cb12fd8e0dabb9a1c8aef55a6a41e2c255fcdf4b Author: Christian Brauner [brauner@kernel.org](mailto:brauner@kernel.org) Date: Mon Feb 12 16:32:38 2024 +0100

pidfd: add pidfs

broke lsof tests:

FAIL: lib/dialects/linux/tests/case-20-pidfd-pid.bash
=====================================================

p7397
f3
npidfd

p7397 f3 npidfd

FAIL lib/dialects/linux/tests/case-20-pidfd-pid.bash (exit status: 1)

Without the commit, lsof shows:

systemd      ... 59 [pidfd:899]

With the commit:

systemd      ... 1187 pidfd

I am not sure whether this is a kernel or lsof fault...

jirislaby commented 6 months ago

This works around the issue: https://github.com/lsof-org/lsof/blob/0a979c5640e277aa403a4ffce48743b6a72bd033/lib/dialects/linux/tests/case-20-pidfd-pid.bash#L14

--- a/lib/dialects/linux/tests/case-20-pidfd-pid.bash
+++ b/lib/dialects/linux/tests/case-20-pidfd-pid.bash
@@ -11,7 +11,8 @@ $TARGET | (
         exit 77
     fi
     line=$($lsof -p $pid -a -d $fd -F pfn| tr '\n' ' ')
-    if ! fgrep -q "p${pid} f${fd} n[pidfd:$pid]" <<<"$line"; then
+    if ! fgrep -q "p${pid} f${fd} n[pidfd:$pid]" <<<"$line" &&
+       ! fgrep -q "p${pid} f${fd} npidfd" <<<"$line"; then
        $lsof -p $pid -a -d $fd -F pfn
        echo
        echo $line
jiegec commented 6 months ago

Thanks, please submit a pull request. I will test on different kernel versions.

jiegec commented 6 months ago

Ah, I see where is the problem:

Before:

$ ls /proc/PID/fd/
total 0
lrwx------ 1 jiegec jiegec 64 May 17 14:52 0 -> /dev/pts/0
lrwx------ 1 jiegec jiegec 64 May 17 14:52 2 -> /dev/pts/0
lrwx------ 1 jiegec jiegec 64 May 17 14:52 3 -> 'anon_inode:[pidfd]'

After:

$ ls /proc/PID/fd/
total 0
lrwx------ 1 jiegec jiegec 64 May 17 14:49 0 -> /dev/pts/0
lrwx------ 1 jiegec jiegec 64 May 17 14:49 2 -> /dev/pts/0
lrwx------ 1 jiegec jiegec 64 May 17 14:49 3 -> 'pidfd:[8555]'*
jirislaby commented 6 months ago

Likely, see also: https://lore.kernel.org/all/20240515-anklopfen-ausgleichen-0d7c220b16f4@brauner/ which did not help. It should mimic the original anon_inode:[pidfd].

jiegec commented 6 months ago

Likely, see also: https://lore.kernel.org/all/20240515-anklopfen-ausgleichen-0d7c220b16f4@brauner/ which did not help. It should mimic the original anon_inode:[pidfd].

I am not sure what's the best solution here:

  1. Only change the tests, but the output remains between different Linux kernel versions
  2. Detect pidfd and mimic the old output for Linux 6.9 in lsof
  3. Wait until linux 6.10 to change the string in procfs
jirislaby commented 6 months ago

Likely, see also: https://lore.kernel.org/all/20240515-anklopfen-ausgleichen-0d7c220b16f4@brauner/ which did not help. It should mimic the original anon_inode:[pidfd].

It does:

# ll /proc/984/fd
total 0
lrwx------ 1 xslaby users 64 May 17 09:00 0 -> /dev/pts/1
lrwx------ 1 xslaby users 64 May 17 09:00 2 -> /dev/pts/1
lrwx------ 1 xslaby users 64 May 17 09:00 3 -> anon_inode:[pidfd]

But lsof -F n output shows anon_inode in the output. Instead of pidfd:PID. Where does this come from?

BTW util-linux had to adapt too, not sure if lsof needs to: https://github.com/util-linux/util-linux/pull/2866

jiegec commented 6 months ago

Likely, see also: https://lore.kernel.org/all/20240515-anklopfen-ausgleichen-0d7c220b16f4@brauner/ which did not help. It should mimic the original anon_inode:[pidfd].

It does:

# ll /proc/984/fd
total 0
lrwx------ 1 xslaby users 64 May 17 09:00 0 -> /dev/pts/1
lrwx------ 1 xslaby users 64 May 17 09:00 2 -> /dev/pts/1
lrwx------ 1 xslaby users 64 May 17 09:00 3 -> anon_inode:[pidfd]

But lsof -F n output shows anon_inode in the output. Instead of pidfd:PID. Where does this come from?

According the code, it checks if it is a regular file before detecting anon_inode, however in Linux 6.9 it is a regular file, so the logic for pidfd is not executed.

jiegec commented 6 months ago

I have implemented the second option

Likely, see also: https://lore.kernel.org/all/20240515-anklopfen-ausgleichen-0d7c220b16f4@brauner/ which did not help. It should mimic the original anon_inode:[pidfd].

I am not sure what's the best solution here:

1. Only change the tests, but the output remains between different Linux kernel versions

2. Detect pidfd and mimic the old output for Linux 6.9 in lsof

3. Wait until linux 6.10 to change the string in procfs

I have implemented the second option in #319

jirislaby commented 6 months ago
  1. Wait until linux 6.10 to change the string in procfs

This is not enough, the complete fix:

--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -2026,7 +2026,6 @@ static struct dentry *prepare_anon_dentry(struct dentry **stashed,
        }

        /* Notice when this is changed. */
-       WARN_ON_ONCE(!S_ISREG(inode->i_mode));
        WARN_ON_ONCE(!IS_IMMUTABLE(inode));

        dentry = d_alloc_anon(sb);
diff --git a/fs/pidfs.c b/fs/pidfs.c
index a63d5d24..f51a794f 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -201,10 +201,7 @@ static const struct super_operations pidfs_sops = {

 static char *pidfs_dname(struct dentry *dentry, char *buffer, int buflen)
 {
-       struct inode *inode = d_inode(dentry);
-       struct pid *pid = inode->i_private;
-
-       return dynamic_dname(buffer, buflen, "pidfd:[%llu]", pid->ino);
+       return dynamic_dname(buffer, buflen, "anon_inode:[pidfd]");
 }

 static const struct dentry_operations pidfs_dentry_operations = {
@@ -217,6 +214,7 @@ static int pidfs_init_inode(struct inode *inode, void *data)
 {
        inode->i_private = data;
        inode->i_flags |= S_PRIVATE;
+       inode->i_mode &= ~S_IFREG;
        inode->i_mode |= S_IRWXU;
        inode->i_op = &pidfs_inode_operations;
        inode->i_fop = &pidfs_file_operations;

Let's see how they fix it in the end.

brauner commented 5 months ago

For now we're mimicking the old pidfd behavior but we will try to flip the switch to the new format:

lrwx------ 1 root root 64 May 22 15:23 22 -> 'pidfd:[818]'

where 818 is a unique inode number unrelated to the pid number and st_mode will be set to S_IFREG.

So I really appreciate the change to accomodate pidfs. Fwiw, both lsfd in util-linux as well as strace have adapted to this already. I didn't catch lsof unfortunately.

jiegec commented 5 months ago

For now we're mimicking the old pidfd behavior but we will try to flip the switch to the new format:

lrwx------ 1 root root 64 May 22 15:23 22 -> 'pidfd:[818]'

where 818 is a unique inode number unrelated to the pid number and st_mode will be set to S_IFREG.

So I really appreciate the change to accomodate pidfs. Fwiw, both lsfd in util-linux as well as strace have adapted to this already. I didn't catch lsof unfortunately.

Thanks for your hard work on Linux kernel. I am okay with all formats (the legacy one, the current one and maybe a future one), and happy to accommodate them in lsof.