lsof-org / lsof

LiSt Open Files
https://lsof.readthedocs.io

LTlock fails on btrfs #152

Open nckx opened 3 years ago

nckx commented 3 years ago

Hi!

The following test failure happens only when building on btrfs. Since everyone can easily reproduce that, and I'm clueless about lsof internals, I'll just leave this here:

Optional tests:
LTbigf ... OK
LTdnlc ... /tmp/guix-build-lsof-4.94.0.drv-0/lsof-4.94.0-checkout/tests found: 100.00%
           OK
LTlock ... lock mismatch: expected W, got " "
           lock mismatch: expected R, got " "
           lock mismatch: expected w, got " "
           lock mismatch: expected r, got " "
Failed tests: 1

make: *** [Makefile:118: opt] Error 1
command "make" "standard" "optional" failed with status 2

masatake commented 3 years ago

Which distribution do you use?

nckx commented 3 years ago

The bug isn't distribution-specific, but I use & first encountered it on Guix System.

masatake commented 3 years ago

> The bug isn't distribution-specific, but I use & first encountered it on Guix System.

Thank you. I had confused btrfs with zfs. I will look into this issue.

masatake commented 3 years ago

Do you use btrfs as your root file system? I would like to know the details of how to reproduce this.

nckx commented 3 years ago

> Do you use btrfs as your root file system?

Yes.

> I would like to know the details of how to reproduce this.

Sure! But note that it doesn't matter whether btrfs is the root file system or not: simply create any btrfs file system, then unpack & build lsof on it. Something like:

dd if=/dev/zero of=/tmp/img bs=1M count=512
mkfs.btrfs /tmp/img
mount -o loop /tmp/img /mnt
cd /mnt
git clone https://github.com/lsof-org/lsof
cd lsof
./Configure linux
make
cd tests
make opt
# LTlock should fail with the message above

nckx commented 3 years ago

By the way, @masatake, the above is from memory, and was tried on an Ubuntu box (just to make sure the distro really doesn't matter after telling you it shouldn't :wink: ).

I don't have access to it now but if you still can't reproduce it, I can copy/paste my exact bash history tomorrow. It will be almost identical to what I wrote.

Thanks for responding so promptly!

nckx commented 3 years ago

This report is about something else, but seems to exhibit the same failure: https://github.com/lsof-org/lsof/issues/61

masatake commented 3 years ago

I found something strange about the device minor numbers.

findmnt reports that the minor number of the btrfs mount point is 67. stat reports that the minor number of a file under that mount point is 68. /proc/locks says that the minor number of a locked file under that mount point is 0x43 (== 67).

I guess lsof is confused by the mismatch between these numbers.

[jet@living]/mnt% findmnt  -oTARGET,SOURCE,FSTYPE,MAJ:MIN /mnt
TARGET SOURCE     FSTYPE MAJ:MIN
/mnt   /dev/loop1 btrfs    0:67 
[jet@living]/mnt% stat /mnt/ /mnt/lsof/
  File: /mnt/
  Size: 8           Blocks: 32         IO Block: 4096   directory
Device: 44h/68d Inode: 256         Links: 1
Access: (0777/drwxrwxrwx)  Uid: (    0/    root)   Gid: (    0/    root)
Context: system_u:object_r:unlabeled_t:s0
Access: 2021-06-21 05:56:49.675987656 +0900
Modify: 2021-06-21 05:56:24.801910504 +0900
Change: 2021-06-21 05:56:24.801910504 +0900
 Birth: 2021-06-21 05:56:24.000000000 +0900
  File: /mnt/lsof/
  Size: 1196        Blocks: 0          IO Block: 4096   directory
Device: 44h/68d Inode: 257         Links: 1
Access: (0755/drwxr-xr-x)  Uid: ( 1000/     jet)   Gid: ( 1000/     jet)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2021-06-21 06:27:15.316102351 +0900
Modify: 2021-06-21 06:27:12.934091866 +0900
Change: 2021-06-21 06:27:12.934091866 +0900
 Birth: 2021-06-21 05:56:24.801910504 +0900
[jet@living]/mnt% stat /mnt/lsof/tests/config.LTlock2630449 
  File: /mnt/lsof/tests/config.LTlock2630449
  Size: 2048        Blocks: 8          IO Block: 4096   regular file
Device: 44h/68d Inode: 842         Links: 1
Access: (0600/-rw-------)  Uid: ( 1000/     jet)   Gid: ( 1000/     jet)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2021-06-21 06:25:43.109696511 +0900
Modify: 2021-06-21 06:25:43.109696511 +0900
Change: 2021-06-21 06:25:43.109696511 +0900
 Birth: 2021-06-21 06:25:43.109696511 +0900
[jet@living]/mnt% grep 842 /proc/locks 
17: POSIX  ADVISORY  WRITE 2630449 00:43:842 0 EOF
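
The mismatch in the transcript can be checked mechanically. The following is a minimal Python sketch, not lsof's actual logic: it parses the /proc/locks line quoted above and compares its device field with the Device: value printed by stat. The names LockEntry, parse_lock_line and same_device are made up for illustration, and the field layout (hexadecimal major and minor, decimal inode) is assumed from the sample line and the 0x43 reading above.

```python
import os
from typing import NamedTuple


class LockEntry(NamedTuple):
    access: str  # "READ" or "WRITE"
    pid: int
    major: int
    minor: int
    inode: int


def parse_lock_line(line: str) -> LockEntry:
    """Parse one /proc/locks line (hypothetical helper, not lsof code)."""
    fields = line.split()
    # Index from the end so that an extra marker near the start of the
    # line (blocked locks carry a "->") does not shift the positions.
    access, pid, dev_inode = fields[-5], fields[-4], fields[-3]
    major_hex, minor_hex, inode = dev_inode.split(":")
    return LockEntry(access, int(pid), int(major_hex, 16),
                     int(minor_hex, 16), int(inode))


def same_device(entry: LockEntry, st_dev: int) -> bool:
    """True if the lock's device equals a stat(2) st_dev value."""
    return (entry.major, entry.minor) == (os.major(st_dev), os.minor(st_dev))


# Values copied from the transcript above.
entry = parse_lock_line("17: POSIX  ADVISORY  WRITE 2630449 00:43:842 0 EOF")
stat_dev = 0x44  # "Device: 44h/68d" as printed by stat

print(entry.inode, entry.minor, os.minor(stat_dev), same_device(entry, stat_dev))
```

With the numbers from this thread the inode agrees (842) but the minor numbers do not (67 in /proc/locks, 68 from stat), so any lookup keyed on the pair of device and inode would miss the lock.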

masatake commented 3 years ago

I guess this has something to do with the subvolume feature of btrfs.
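
One way to test that guess on other file systems is to compare the device number recorded for a mount point in /proc/self/mountinfo with the one stat returns for a path under it. The sketch below is illustrative only: mountinfo_device is a hypothetical helper, it assumes the documented mountinfo layout (third field is major:minor in decimal, fifth field is the mount point), and the sample line is invented apart from 0:67, /mnt, btrfs and /dev/loop1, which come from the findmnt output above.

```python
import os

# Illustrative mountinfo line. Only "0:67", "/mnt", "btrfs" and "/dev/loop1"
# are taken from the thread; every other field is invented.
SAMPLE_MOUNTINFO = "101 29 0:67 / /mnt rw,relatime shared:55 - btrfs /dev/loop1 rw\n"


def mountinfo_device(mount_point, mountinfo_text):
    """Return (major, minor) recorded for mount_point, or None if absent."""
    found = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # fields[2] is "major:minor" (decimal), fields[4] is the mount point.
        if len(fields) >= 5 and fields[4] == mount_point:
            major, minor = fields[2].split(":")
            found = (int(major), int(minor))  # a later entry overrides an earlier one
    return found


mount_dev = mountinfo_device("/mnt", SAMPLE_MOUNTINFO)
stat_dev = (os.major(0x44), os.minor(0x44))  # "Device: 44h/68d" from stat

print(mount_dev, stat_dev, mount_dev == stat_dev)
```

On a live system the same comparison could use the contents of /proc/self/mountinfo and os.stat(path).st_dev. A file system where the two always agree would not be expected to trigger this failure.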

masatake commented 3 years ago

As the test case shows, lsof cannot correctly report the lock information (R/W) for a locked file on btrfs.
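
The underlying condition can be observed without building lsof. The sketch below is not the LTlock test itself, only a rough stand-in: it takes a POSIX write lock on a scratch file and then looks for that lock in /proc/locks, so the device numbers can be compared by eye. find_own_lock is a hypothetical name, and the scratch-file size of 2048 bytes merely mirrors the file size seen in the stat output earlier in the thread.

```python
import fcntl
import os
import tempfile


def find_own_lock(directory, locks_path="/proc/locks"):
    """Lock a scratch file in `directory`, then look the lock up in /proc/locks.

    Returns (st_dev, lines): the device number from fstat() and the
    /proc/locks entries for this PID whose inode matches the scratch file.
    """
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        tmp.write(b"\0" * 2048)
        tmp.flush()
        fcntl.lockf(tmp, fcntl.LOCK_EX)  # POSIX write lock over the whole file
        st = os.fstat(tmp.fileno())
        pid, inode_suffix = str(os.getpid()), f":{st.st_ino}"
        try:
            with open(locks_path) as locks:
                rows = [line.split() for line in locks]
        except OSError:
            rows = []  # no readable /proc/locks on this system
        mine = [
            " ".join(fields) for fields in rows
            if len(fields) >= 5
            and fields[-4] == pid
            and fields[-3].endswith(inode_suffix)
        ]
        return st.st_dev, mine


# Pass a directory on btrfs (such as /mnt in this thread) to reproduce;
# the system temp directory is used here only so the sketch runs anywhere.
dev, lines = find_own_lock(tempfile.gettempdir())
print(f"stat:  {os.major(dev):02x}:{os.minor(dev):02x}")
for line in lines:
    print("locks:", line)
```

If the major:minor pair printed for stat differs from the pair in the matching /proc/locks entry, the directory shows the same discrepancy as reported above.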

masatake commented 3 years ago

https://www.spinics.net/lists/linux-btrfs/msg09044.html

mjt0k commented 11 months ago

See also:
https://unix.stackexchange.com/questions/345220/btrfs-how-to-get-real-device-id
http://www.sabi.co.uk/blog/21-two.html?210804#210804 (linked to from above)
https://github.com/util-linux/util-linux/issues/2349#issuecomment-1615854709 (similar issue with lsfd)