Open arko-pl opened 5 months ago
Hello Arkadiusz,
"Arkadiusz B.":
mkdir /union/mnt/new
mount -o bind /union /union/mnt/new
mount -o remount /union/mnt/new
after last command console is locked.
I tried the same command sequence on vanilla 6.1.0 + my current aufs6.1, but could not reproduce. I will try again later with an environment closer to yours.
And did you install aufs-util.git on your system?
J. R. Okajima
And did you install aufs-util.git on your system?
I got outdated version, but after update I'm also reproducing this issue.
"Arkadiusz B.":
And did you install aufs-util.git on your system?
I got outdated version, but after update I'm also reproducing this issue.
I tried reproducing again on plain linux-v6.1.92 + aufs6.1, but could not reproduce the problem.
The kernel log in your first mail shows
For snmpd[21963] and kbtest[9102], it is understandable that they stopped working since mount[16359] is still running. Then why mount[16359] stopped working? It was warned as the semaphore owner is different, and then stopped working. But the stopped point is ahead from the first warning, which means after the first warning mount[16359] kept running and moved ahead. And then stopped with holding a semaphore. Is this the scenario? Really? I don't understand.
Hmm, I don't know why the semaphore owner is different. It has to be the same. In the remount process, the aufs au_fsctx_reconfigure() function acquires the semaphore and au_remount_refresh() (in the same process) releases the semaphore temporarily. I guess LOCKDEP produces the warning here. It is weird.
Did you apply some other patches to your kernel? I guess you applied lockdep-debug.patch in aufs-standalone.git. If you didn't, the kernel build should fail, or another warning would be produced much earlier and LOCKDEP would be off.
J. R. Okajima
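Since the question above depends on which lock-debugging options the kernel was built with, a small helper like the following could report them. This is only a sketch: the function name `lock_debug_options` is hypothetical, the default path `/boot/config-$(uname -r)` is a distribution-dependent assumption, and the option list (`CONFIG_LOCKDEP`, `CONFIG_PROVE_LOCKING`, `CONFIG_DEBUG_RWSEMS`, `CONFIG_DEBUG_LOCK_ALLOC`) is an assumption about which options matter on a 6.1-era kernel, not something stated in this thread.

```shell
#!/bin/sh
# lock_debug_options [CONFIG_FILE]
# Prints the lock-debugging options that are set to "y" in a kernel config.
# CONFIG_FILE defaults to /boot/config-$(uname -r), which is an assumption:
# not every distribution installs the config there.
lock_debug_options() {
    config=${1:-/boot/config-$(uname -r)}
    # Match only options that are enabled ("=y"); lines such as
    # "# CONFIG_FOO is not set" are ignored.
    grep -E '^CONFIG_(LOCKDEP|PROVE_LOCKING|DEBUG_RWSEMS|DEBUG_LOCK_ALLOC)=y$' "$config"
}

# Example (hypothetical path):
#   lock_debug_options /boot/config-6.1.92
```

An empty result (with a non-zero exit status from grep) would suggest that none of these options is enabled in the given config file.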
Did you apply some other patches to your kernel?
No, only the standard set of patches was applied. Here is how I'm reproducing this from scratch:
mkdir /union
mkdir /mnt/image
dd if=/dev/zero of=/tmp/image bs=1M count=8
mkfs.ext4 /tmp/image
mount -o loop /tmp/image /mnt/image
mount -t aufs -o br:/mnt/image=rw none /union
mkdir /union/usr
cp /bin/bash /union/usr/
mkdir -p /union/mnt/test
mount -o remount /union/mnt/test
"Arkadiusz B.":
Here is how I'm reproducing this from scratch:
mkdir /union
dd if=/dev/zero of=/tmp/image bs=1M count=8
mkfs.ext4 /tmp/image
mount -o loop /tmp/image /mnt/image
mount -t aufs -o br:/mnt/image=rw none /union
mkdir /union/usr
cp /bin/bash /union/usr/
mkdir -p /union/mnt/test
mount -o remount /union/mnt/test
Hmm, that is really strange. The last "remount" should return an error saying "that is not a mount point" or something. Did you forget mount -o bind /union /union/mnt/test just before the last "remount"?
Anyway I tried using loopback mounted ext4 on v6.1.92 again, but could not reproduce.
Something is totally broken. I don't think it is sane that the semaphore owner changes silently. I'm not sure whether this would help us or not, but what happens if you try mount -o move /union /union/mnt/test and "remount" instead of "bind"?
J. R. Okajima
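The expectation above, that a remount of a path which is not a mount point should fail, can be checked from user space before running the remount. The following is a minimal sketch, assuming a Linux system where /proc/self/mountinfo is available; the function name `is_mount_point` and its optional second argument are hypothetical and exist only so the check can be tried against a saved copy of the table.

```shell
#!/bin/sh
# is_mount_point PATH [MOUNTINFO_FILE]
# Succeeds (exit status 0) when PATH appears as a mount point, i.e. as
# field 5 of some line in the mountinfo table.
# Caveat: mountinfo escapes special characters (a space becomes \040),
# so this simple comparison assumes a path without such characters.
is_mount_point() {
    path=$1
    table=${2:-/proc/self/mountinfo}
    awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$table"
}

# Example: only attempt the remount when the target is really mounted.
#   if is_mount_point /union/mnt/test; then
#       mount -o remount /union/mnt/test
#   else
#       echo "/union/mnt/test is not a mount point" >&2
#   fi
```

Without the preceding bind mount, /union/mnt/test is just a directory inside /union, so a check like this one would fail.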
Did you forget mount -o bind /union /union/mnt/test just before the last "remount"?
Sorry, my "enter" key got stuck, sent the message and closed the ticket... This is correct set of commands:
mkdir -p /mnt/image
mkdir /union
dd if=/dev/zero of=/tmp/image bs=1M count=8
mkfs.ext4 /tmp/image
mount -o loop /tmp/image /mnt/image
mount -t aufs -o br:/mnt/image=rw none /union
mkdir /union/usr
cp /bin/bash /union/usr/
mkdir -p /union/mnt/test
mount -o bind /union/usr /union/mnt/test
mount -o remount /union/mnt/test
mount -o move doesn't work; it ends with "invalid argument".
There is one more thing to add. The aufs is mounted at the initrd stage where busybox is used. But I'm also reproducing on running system without busybox.
"Arkadiusz B.":
Sorry, my "enter" key got stuck, sent the message and closed the ticket... This is correct set of commands:
I see. I guess "bind"ing a subdir onto another (sub)subdir is the key. I was trying to bind the ROOT dir onto a subdir (and failed to reproduce). Now I will think about why older aufs could handle it. Give me some time.
J. R. Okajima
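The difference between the two bind mounts discussed above is visible from user space: field 4 of /proc/self/mountinfo is the directory inside the filesystem that a mount exposes. Binding /union itself shows "/" there, while binding /union/usr shows "/usr". The following is a minimal sketch for reading that field; the function name `mount_root` and the optional file argument are hypothetical, and the link between this field and the bug is only an illustration of the guess above, not a confirmed diagnosis.

```shell
#!/bin/sh
# mount_root PATH [MOUNTINFO_FILE]
# Prints field 4 ("root") of the mountinfo line whose mount point
# (field 5) equals PATH. When several mounts are stacked on PATH, the
# last line in the table wins. Fails if PATH is not a mount point.
mount_root() {
    path=$1
    table=${2:-/proc/self/mountinfo}
    awk -v p="$path" '
        $5 == p { root = $4 }
        END { if (root == "") exit 1; print root }
    ' "$table"
}

# Example:
#   mount_root /union/mnt/test
```

A result other than "/" means the mount exposes a subdirectory of the filesystem rather than its root.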
Great, thank you.
I guess "bind"ing a subdir onto another (sub)subdir is the key. I was trying to bind the ROOT dir onto a subdir (and failed to reproduce). Now I will think about why older aufs could handle it.
In aufs5.10, aufs adopted the mainline fs_context, and it passes a different dentry (the one for the command line parameter, /union/mnt/test in your case). Aufs should not trust that it is the root dentry of the aufs super_block, I think. Would you test this patch?
J. R. Okajima
diff --git a/fs/aufs/fsctx.c b/fs/aufs/fsctx.c
index 43b21910bc67..73d0cbe5b2c9 100644
--- a/fs/aufs/fsctx.c
+++ b/fs/aufs/fsctx.c
@@ -47,6 +47,7 @@ static int au_fsctx_reconfigure(struct fs_context *fc)
root = fc->root;
sb = root->d_sb;
Would you test this patch?
I tested it and could not reproduce issue anymore. Thank you :+1:
"Arkadiusz B.":
I tested it and could not reproduce issue anymore. Thank you :+1:
Thanx for testing. The patch will be merged after a few weeks. I'm testing several kernel versions now to release on next Monday. It is not for this "root" dentry bug, but another AIO bug. The fix for this "root" dentry bug will be released after that.
J. R. Okajima
Great news. Thank you.
------- Blind-Carbon-Copy
From: "J. R. Okajima" @.>
To: @.
Subject: aufs6 GIT release (v6.10-rc5), aufs6.1--aufs6.5 will end
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: @.>
Date: Mon, 01 Jul 2024 06:20:39 +0900
Message-ID: @.>
o Bugfix
o News
J. R. Okajima
------- End of Blind-Carbon-Copy
Hello,
I noticed a deadlock while trying to remount a nested, bind-mounted mount point. It is reproducible with different underlying filesystems. Steps to reproduce (assuming /union is a simple aufs r/w mount point):
mkdir /union/mnt/new
mount -o bind /union /union/mnt/new
mount -o remount /union/mnt/new
After the last command the console is locked.
At the time of writing, the last kernel that works is 5.4. The issue is also observed on the 5.15 LTS and 6.1 LTS kernels.
Here is a dump of tasks from the 6.1.92 kernel: