sfjro / aufs-standalone


Can't unmount union since 6.3.x #29

Closed fulalas closed 1 year ago

fulalas commented 1 year ago

Hi!

I noticed that since kernel 6.3.0 the system can no longer unmount /union during the shutdown/reboot process (on Porteus 5, which mounts the whole system with aufs at boot). Reverting the kernel to any 6.2.x release lets it unmount /union properly. This has been confirmed by many users.

Now, I don't know if this is an aufs issue or a kernel (upstream) issue. What I know is that using Porteus 5 with OverlayFS (instead of aufs) this issue doesn't happen.

These are the commands Porteus uses to unmount everything in the shutdown/reboot process:

umount -nl /union/*
umount -nl /union/mnt/*
umount /union

The last (third) command is the one that fails in 6.3.x with the message "Device or resource busy", while in 6.2.x it unmounts without any complaints.

It's funny because using 6.3.x I can cd /union, execute rm -fr * and it removes everything just fine, but /union itself is locked and can't be removed or unmounted.

Any ideas?

Thanks once again for the hard work! :)

sfjro commented 1 year ago

Hello fulalas,

fulalas:

I noticed that since kernel 6.3.0 the system can no longer unmount /union during the shutdown/reboot process (on Porteus 5, which mounts the whole system with aufs at boot). Reverting the kernel to any 6.2.x release lets it unmount /union properly. This has been confirmed by many users.

It means someone is using aufs, which keeps it busy (in use). So let's find out who is using which file in aufs. The first approach is "sudo lsof", which you might have already tried.
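When lsof is not available (for example late in shutdown), the same question can be asked of /proc directly. The following is a minimal sketch, not part of aufs or Porteus: the function name `list_holders` and its optional second argument are hypothetical, and it assumes a Linux-style /proc where open descriptors appear as symlinks under /proc/PID/fd and mappings appear in /proc/PID/maps.

```shell
#!/bin/sh
# list_holders MOUNTPOINT [PROC_ROOT]   (hypothetical helper, for illustration)
# Print "PID<TAB>PATH" for every open descriptor or memory mapping that
# refers to a path under MOUNTPOINT. PROC_ROOT defaults to /proc.
list_holders() {
    mnt=${1%/}
    proc=${2:-/proc}
    for piddir in "$proc"/[0-9]*; do
        [ -d "$piddir" ] || continue
        pid=${piddir##*/}
        # Open file descriptors are symlinks under PROC_ROOT/PID/fd.
        for fd in "$piddir"/fd/*; do
            target=$(readlink "$fd" 2>/dev/null) || continue
            case $target in
                "$mnt"|"$mnt"/*) printf '%s\t%s\n' "$pid" "$target" ;;
            esac
        done
        # Mapped files: the pathname starts at the first "/" on a maps line.
        if [ -r "$piddir/maps" ]; then
            sed -n 's|^[^/]*\(/.*\)$|\1|p' "$piddir/maps" 2>/dev/null |
            while IFS= read -r path; do
                case $path in
                    "$mnt"|"$mnt"/*) printf '%s\t%s\n' "$pid" "$path" ;;
                esac
            done
        fi
    done | sort -u
}
```

Running `list_holders /union` as root just before the final `umount /union` would list the processes that still hold descriptors or mappings there. Mappings matter here because a file that is mmap'ed but no longer open still keeps the filesystem busy.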

So I'd suggest trying the module parameter 'debug=1' and the MagicSysrq key. The module parameter may not be so helpful in this case, but I hope MagicSysrq will help. By default, MagicSysrq + 'A' dumps all files aufs is using, and I hope it helps you find the process (and the file).

(from aufs manual)

.B sysrq=key
Specifies MagicSysRq key for debugging aufs.
You need to enable both of CONFIG_MAGIC_SYSRQ and CONFIG_AUFS_DEBUG.
Currently this is for developers only.
The default is `a'.
.
.TP
.B debug= 0 | 1
Specifies disable(0) or enable(1) debug print in aufs.
This parameter can be changed dynamically.
You need to enable CONFIG_AUFS_DEBUG.
Currently this is for developers only.
The default is `0' (disable).
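For a shutdown script with no keyboard at hand, the dump can probably be triggered from the shell as well. This is only a sketch under assumptions: the sysfs path /sys/module/aufs/parameters/debug is assumed from the usual layout for module parameters (the manual only says the parameter can be changed dynamically), the key `a` is the default named in the manual excerpt, and the function name `aufs_dump_files` is made up for illustration. /proc/sysrq-trigger is the standard kernel interface for issuing a SysRq key without the keyboard.

```shell
#!/bin/sh
# aufs_dump_files [DEBUG_PARAM] [SYSRQ_TRIGGER]   (hypothetical helper)
# Enable aufs debug printing if the parameter file is writable, then
# issue the aufs SysRq key so the kernel logs the files aufs holds.
aufs_dump_files() {
    # Assumed location of the 'debug' module parameter.
    debug_param=${1:-/sys/module/aufs/parameters/debug}
    trigger=${2:-/proc/sysrq-trigger}
    if [ -w "$debug_param" ]; then
        echo 1 > "$debug_param"
    fi
    # 'a' is the default aufs key according to the manual excerpt above.
    echo a > "$trigger"
}
```

After calling `aufs_dump_files` as root, the dump should appear in the kernel log (for example via `dmesg`), provided the kernel was built with CONFIG_MAGIC_SYSRQ and CONFIG_AUFS_DEBUG.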

J. R. Okajima

peabee commented 1 year ago

Hello @fulalas and @sfjro from the Puppy Linux community. I'm not sure how many Puppy users are using the 6.3 kernel with aufs but there are certainly a few (3+ including me) and so far I haven't seen any problem reports. I'll PM this thread to the users I know about. We can use the Porteus kernel in Puppy so if I get time I'll see if that shows any problems. Regards PeaBee

peabee commented 1 year ago

Seems OK - no error messages on shutdown...... [screenshot]

porteux commented 1 year ago

@sfjro, thanks for the tips. Here's what I have so far:

kernel 6.3.9 (just before unmounting everything to reboot/shutdown) + nvidia driver: [three screenshots]

And heaps more screens like the ones above until this last one: [screenshot]

Now with kernel 6.2.11 + nvidia driver: [three screenshots]

As you can see, for some reason kernel 6.3.x locks the whole system so the system can't unmount /union, while kernel 6.2.x doesn't.

As said, what's funny is that I can easily rm -fr /union/* (after unmounting its children, of course) and not a single file is left, yet I still can't unmount it, and calling MagicSysrq (alt+print+a) prints the same long list of files in use.

In case you're wondering why I'm specifying '+ nvidia driver', that's because on my system if I don't use nvidia driver in 6.3.x it works perfectly:

[screenshot]

Oh, in all scenarios lsof returns the same result: [screenshot]

We could blame nvidia but people on Porteus forum without nvidia card reported failure during /union unmount.

Question: when you updated your aufs project from 6.2.x to 6.3.x have you changed any part of the code apart from some adaptation to make it work on 6.3.x?

What's clear is that something has changed since 6.3.x. If it's the kernel or aufs or the combination of both or some mystical explanation, I don't know :D

Thanks once again!

sfjro commented 1 year ago

PorteuX:

@sfjro, thanks for the tips. Here's what I have so far:

kernel 6.3.9 (just before unmounting everything to reboot/shutdown) + nvidia driver: [screenshots] :::

It shows that you have still many files opened in aufs.

Now with kernel 6.2.11 + nvidia driver:

Your last two lines are "aufs: files" and "aufs: done", which means your files are all closed and nothing is in use.

As said, what's funny is that I can easily rm -fr /union/* (after unmounting its children, of course) and not a single file is left, yet I still can't unmount it, and calling MagicSysrq (alt+print+a) prints the same long list of files in use.

Yes, removing files never means closing them. The files remain in use.
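The difference between removing and closing can be seen on any filesystem, not only aufs. The sketch below uses a made-up function name (`demo_unlinked_but_open`) and a temporary directory: it opens a file on descriptor 3, removes the name, and then still reads the content through the open descriptor. Only closing the descriptor releases the file, which is why `rm -fr /union/*` cannot make /union unmountable by itself.

```shell
#!/bin/sh
# demo_unlinked_but_open   (hypothetical name, illustration only)
# Show that unlinking a file does not close it.
demo_unlinked_but_open() {
    dir=$(mktemp -d) || return 1
    printf 'still here\n' > "$dir/file"
    exec 3< "$dir/file"      # open: the kernel takes a reference to the file
    rm -f "$dir/file"        # unlink: only the name is removed
    if [ -e "$dir/file" ]; then
        exec 3<&-
        return 1
    fi
    cat <&3                  # the data is still readable through fd 3
    exec 3<&-                # close: the reference is finally dropped
    rmdir "$dir"
}
```

Calling `demo_unlinked_but_open` prints `still here` even though the file name was already gone when the read happened.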

In case you're wondering why I'm specifying '+ nvidia driver', that's because on my system if I don't use nvidia driver in 6.3.x it works perfectly:

That is a mystery for me because I don't know what 'nvidia driver' does.

Question: when you updated your aufs project from 6.2.x to 6.3.x have you changed any part of the code apart from some adaptation to make it work on 6.3.x?

Essentially I made only two commits.

8a331ddd0 2023-03-07 aufs: for v6.3-rc1, new header filelock.h
322e81b5f 2023-03-07 aufs: for v6.3-rc1, mnt_user_ns() is replaced by mnt_idmap()

You can see them by running 'git log aufs6.2..aufs6.3'.

What's clear is that something has changed since 6.3.x. If it's the kernel or aufs or the combination of both or some mystical explanation, I don't know :D

Unfortunately I cannot dive into aufs now because my SSD broke, and I'm still struggling to restore my environment. One possible scenario is that aufs has a bug around the file reference count, which is incremented when a file is opened and decremented when it is closed. When the reference count reaches zero, the file is no longer in use. Your magic sysrq output shows many files are still open from aufs' point of view.

J. R. Okajima

fulalas commented 1 year ago

@sfjro, thank you for the support.

No rush. Whenever you have time to look at this it will be great. If we can help you with your environment, let us know. :)

I'm wondering, if this is a bug in aufs why it didn't show up in 6.2.x and lower versions? I looked at your 2 commits and they look OK and small, although I'm not an expert in anything related to kernel.

fulalas commented 1 year ago

Not sure if this helps, but the mount command used to load .xzm modules into the union is mount -no remount,add:1:"$MOD"=rr aufs /, where $MOD is pointing to the loop image previously mounted with mount -no loop,ro "$targetmod" "$MOD", where $targetmod is the module file path.

Also, not sure if this is related, but since util-linux 2.39 I can no longer mount .xzm modules into the union: https://github.com/util-linux/util-linux/issues/2309#issuecomment-1612771116

ncmprhnsbl commented 1 year ago

as far as i can see, this is to do with the util-linux mount regression. as seen in the util-linux#2309 issue the mount command : mount -no remount,add:1:"$MOD"=rr aufs / needs to be mount -no remount,add=1:"$MOD"=rr aufs / ... similarly, for the removal: mount -t aufs -o remount,del:$MOD aufs should be mount -t aufs -o remount,del=$MOD aufs
.. kernel version doesn't seem to matter. in my skimming of the manual(possibly not even the right version) the syntax/usage of remount,add/del seems a bit vague, perhaps some expansion could be useful in that area.

EDIT: ok :p on closer inspection, there's something else going on here causing 'unclean' unmounting on shutdown, with the newer kernels >6.3
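For scripts that must work with util-linux 2.39 and no updated mount.aufs helper, the colon-to-equals rewrite described above can be done before calling mount. This is a minimal sketch, not the helper's actual code: the function name `to_equal_syntax` is hypothetical, and the keyword list (br, add, ins, del, mod, append, prepend) is taken from the options discussed in this thread. Only the first colon after the keyword is changed, so the branch list itself stays colon-separated.

```shell
#!/bin/sh
# to_equal_syntax OPTIONS   (hypothetical helper, illustration only)
# Rewrite aufs mount options from the "colon" form (add:1:/dir=rr)
# to the "equal" form (add=1:/dir=rr).
to_equal_syntax() {
    printf '%s\n' "$1" |
        sed -E 's/(^|,)(br|add|ins|del|mod|append|prepend):/\1\2=/g'
}
```

A caller could then write `mount -no "$(to_equal_syntax "remount,add:1:$MOD=rr")" aufs /`, which keeps existing option strings unchanged in the script while passing the form that the newer mount accepts.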

sfjro commented 1 year ago

Hello,

ncmprhnsbl:

as far as i can see, this is to do with the util-linux mount regression. as seen in the util-linux#2309 the mount command : mount -no remount,add:1:"$MOD"=rr aufs / needs to be mount -no remount,add=1:"$MOD"=rr aufs / ... similarly, for the removal: mount -t aufs -o remount,del:$MOD aufs should be mount -t aufs -o remount,del=$MOD aufs
.. kernel version doesn't seem to matter.

You're right. For aufs, the change in util-linux is a regression. But I can understand the change and I won't ask util-linux to handle aufs differently.

The kernel version is a problem in my environment only. The old kernel cannot be compiled by the new compiler. This is a matter of my development environment.

in my skimming of the manual(possibly not even the right version) the syntax/usage of remount,add/del seems a bit vague, perhaps some expansion could be useful in that area.

Hmm, there are a few examples in EXAMPLE section. But I will add one or two.

J. R. Okajima

TurboBlaze commented 1 year ago

Hello @sfjro Any news?

Regards Blaze

sfjro commented 1 year ago

Hello,

TurboBlaze:

Hello @sfjro Any news?

For fulalas' original problem, where aufs cannot be unmounted because of EBUSY, there is no progress on my side.

For ncmprhnsbl's post, util-linux(libmount)'s change, I made a workaround in mount.aufs(8) helper. And I am going to test now. But my development environment still suffers from my ssd damage. I'm still working, but the pace is very slow.

Here is the patch I wrote to follow the libmount's change.

J. R. Okajima

commit 0d80114bf831a7ac6fe9b9f8adc8657349c15b9c
Author: J. R. Okajima @.***>
Date: Fri Jul 21 09:27:01 2023 +0900

workaround for fsctx in util-linux 2.39

util-linux (libmount) v2.39 issues fsmount(2) families and it rejects
the aufs mount option using the "colon" syntax such as "br:rw:ro" and
"del:rw".
In order to make it keep working, mount.aufs(8) helper translates the
colon to equal sign, such like "br=rw:ro" and "del=rw".
This tranlation should always work regardless the version of
util-linux.

See-also: ***@***.***/msg05912.html
Reported-by: Thomas Weißschuh <***@***.***>
Signed-off-by: J. R. Okajima <***@***.***>

diff --git a/aufs.in.5 b/aufs.in.5
index 1e112c9..ace1973 100644
--- a/aufs.in.5
+++ b/aufs.in.5
@@ -55,6 +55,8 @@ whplink-dir(*[AUFS_WH_PLINKDIR]) if necessary
 .
 .TP
 .B br:BRANCH[:BRANCH ...] (dirs=BRANCH[:BRANCH ...])
+.TQ
+.B br=BRANCH[:BRANCH ...]
 Adds new branches.
 (cf. Branch Syntax).
 
@@ -70,9 +72,16 @@ work correctly.
 By default (since linux-3.2 until linux-3.18-rc1), aufs prohibits such
 operation internally, but there left a way to do.
 (cf. Branch Syntax).
+
+If you use mount(8) from util-linux v2.39 and later, you cannot use
+the colon (br:) and you have to use the equal sign (br=) instead.
+But if you install aufs-util release 20230724 (or later), you can use
+the colon too.
 .
 .TP
 .B [ add | ins ]:index:BRANCH
+.TQ
+.B [ add | ins ]=index:BRANCH
 Adds a new branch.
 The index begins with 0.
 Aufs creates
@@ -95,9 +104,14 @@ If you want to update the contents of a process address space after adding,
 you need to restart your process or open/mmap the file again.
 .\" Usually, such files are executables or shared libraries.
 (cf. Branch Syntax).
+
+If you want to use the colon (add:), then you need to install
+aufs-util release 20230724 or later.
 .
 .TP
 .B del:dir
+.TQ
+.B del=dir
 Removes a branch.
 Aufs does not remove
 whiteout-base(*[AUFS_WH_BASE]) and
@@ -109,9 +123,14 @@ If a process is referencing the file/directory on the deleting branch
 (by open, mmap, current working directory, etc.), aufs will return an error
 EBUSY. In this case, a script `aubusy' (in aufs\-util.git and aufs2\-util.git)
 is useful to identify which process (and which file) makes the branch busy.
+
+If you want to use the colon (del:), then you need to install
+aufs\-util release 20230724 or later.
 .
 .TP
 .B mod:BRANCH
+.TQ
+.B mod=BRANCH
 Modifies the permission flags of the branch.
 Aufs creates or removes
 whiteout\-base(\*[AUFS_WH_BASE]) and/or
@@ -127,14 +146,21 @@ Additionally when you enable CONFIG_IMA (in linux\-2.6.30 and later), IMA
 may produce some wrong messages. But this is equivalent when the filesystem
 is changed `ro' in emergency.
 (cf. Branch Syntax).
+
+If you want to use the colon (mod:), then you need to install
+aufs-util release 20230724 or later.
 .
 .TP
 .B append:BRANCH
+.TQ
+.B append:BRANCH
 equivalent to `add:(last index + 1):BRANCH'.
 (cf. Branch Syntax).
 .
 .TP
 .B prepend:BRANCH
+.TQ
+.B prepend=BRANCH
 equivalent to `add:0:BRANCH.'
 (cf. Branch Syntax).
 .
diff --git a/mount.aufs.c b/mount.aufs.c
index e085d4c..515be81 100644
--- a/mount.aufs.c
+++ b/mount.aufs.c
@@ -219,6 +219,45 @@ static int drop_level(int argc, char **argv, int idx)
 	return 0;
 }
 
+/*

sfjro commented 1 year ago

Hi,

fulalas:

No rush. Whenever you have time to look at this it would be great. If we could help you with your environment, let us know. :)

I've reviewed aufs6.3 and found some suspicious code around the file reference count. Please try this patch and see if you can unmount aufs cleanly. But I am not sure this is the cause of your problem. Also, the line numbers in the patch may differ from your source file. If they do, please apply it manually.

J. R. Okajima


diff --git a/mm/mmap.c b/mm/mmap.c
index 61a4bede666e..8ff923ccfe2b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2863,21 +2865,21 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	if (vma->vm_flags & VM_LOCKED)
 		flags |= MAP_LOCKED;

#if IS_ENABLED(CONFIG_AUFS_FS)

vma_get_file(vma);
file = vma->vm_file;
prfile = vma->vm_prfile;
ret = do_mmap(vma->vm_file, start, size,
        prot, flags, pgoff, &populate, NULL);
fulalas commented 1 year ago

@sfjro, it seems your patch fixed the issue, yes! I didn't have time to make solid tests though. I'll let you know.

Thank you once again for the hard work!

sfjro commented 1 year ago

fulalas:

@sfjro, it seems your patch fixed the issue, yes! I didn't have time to make solid tests though. I'll let you know.

Thanx for the test. I'll merge and release the patch next Monday.

J. R. Okajima

fulalas commented 1 year ago

@sfjro, I did more tests and unfortunately your patch doesn't fix the issue in all scenarios -- in my case, it fails using Nvidia drivers.

Other people in Porteus forum also reported your patch didn't fix the issue for them.

So I guess we need to continue investigating.

Thanks!

sfjro commented 1 year ago

fulalas:

@sfjro, I did more tests and unfortunately your patch doesn't fix the issue in all scenarios -- in my case, it fails using Nvidia drivers.

Other people in Porteus forum also reported your patch didn't fix the issue for them.

Thanx for the report. I will try more.

J. R. Okajima

sfjro commented 1 year ago

fulalas:

So I guess we need to continue investigating.

Please try this one-liner patch. I'm still testing.

J. R. Okajima

diff --git a/mm/mmap.c b/mm/mmap.c
index 90ab9002f976..1e286c19f9c9 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2639,7 +2639,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,

sfjro commented 1 year ago

"J. R. Okajima":

Please try this one-liner patch. I'm still testing.

My local test is not bad, but it is doubtful that the patch will solve the problem.

Now I'm considering removing aufs[56]-mmap.patch entirely. The purpose of aufs[56]-mmap.patch is to show the correct path in /proc/PID/maps and the symlink target of /proc/PID/fd/N. It was necessary for some applications to work correctly. For instance, debian apt-get(1), if I remember correctly. And probably lsof(1) wants the correct path too.
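To see which paths the kernel reports for a process, and therefore what tools such as lsof(1) read, both views can be inspected by hand. The sketch below assumes a Linux /proc and uses two hypothetical helper names, `mapped_files` and `fd_target`. With aufs[56]-mmap.patch applied these are expected to show the path inside the union. What exactly is shown without the patch is not stated in this thread, so compare the output on both kernels rather than relying on this sketch.

```shell
#!/bin/sh
# mapped_files PID   (hypothetical helper)
# Print the distinct file paths listed in /proc/PID/maps.
mapped_files() {
    # The pathname column starts at the first "/" on each line;
    # anonymous mappings such as [heap] contain no "/" and are skipped.
    sed -n 's|^[^/]*\(/.*\)$|\1|p' "/proc/$1/maps" | sort -u
}

# fd_target PID FD   (hypothetical helper)
# Print the symlink target of /proc/PID/fd/FD.
fd_target() {
    readlink "/proc/$1/fd/$2"
}
```

For example, `mapped_files $$` lists the binaries and libraries mapped by the current shell, and `fd_target $$ 0` shows what its standard input points at.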

If users and their applications, I mean the use-cases, allow the incorrect path, aufs[56]-mmap.patch can be removed.

J. R. Okajima

fulalas commented 1 year ago

My local test is not bad, but it is doubtful that the patch will solve the problem.

@sfjro, your guess is correct: this last patch unfortunately didn't fix the issue.

Now I'm considering removing aufs[56]-mmap.patch entirely.

I'm happy to test your last idea. How should I proceed? :)

sfjro commented 1 year ago

fulalas:

Now I'm considering removing aufs[56]-mmap.patch entirely.

I'm happy to test your last idea. How should I proceed? :)

Revert the patch by running

$ for i in the_last_two_patches_I_sent
  do patch -p1 -R < $i
  done
$ patch -p1 -R < aufs6-standalone.git/aufs6-mmap.patch

and then rebuild your kernel.

J. R. Okajima

fulalas commented 1 year ago

I thought you were saying that we could remove all patches made to mmap.c file. Well, I tried but the system didn't work properly -- I could not even load the GUI.

sfjro commented 1 year ago

fulalas:

I thought you were saying that we could remove all patches made to mmap.c file. Well, I tried but the system didn't work properly -- I could not even load the GUI.

Arg, I forgot to mention this one small patch.

J. R. Okajima

diff --git a/fs/aufs/file.h b/fs/aufs/file.h
index 4ed41bb59d3d..7d2be2d7f619 100644
--- a/fs/aufs/file.h
+++ b/fs/aufs/file.h
@@ -317,12 +317,14 @@ static inline void au_vm_file_reset(struct vm_area_struct *vma,
 static inline void au_vm_prfile_set(struct vm_area_struct *vma,
 				    struct file *file)
 {
+#if 0
 	get_file(file);
 	vma->vm_prfile = file;
 #ifndef CONFIG_MMU
 	get_file(file);
 	vma->vm_region->vm_prfile = file;
 #endif
+#endif
 }
 
 #endif /* __KERNEL__ */

sfjro commented 1 year ago

fulalas:

I thought you were saying that we could remove all patches made to mmap.c file. Well, I tried but the system didn't work properly -- I could not even load the GUI.

Not only mm/mmap.c. aufs6-mmap.patch modifies several files.

If it is not easy for you to revert aufs6-mmap.patch, then I'd suggest trying the approach that keeps aufs6-mmap.patch.

J. R. Okajima

diff --git a/mm/mmap.c b/mm/mmap.c
index a042cf64c9f0..90ab9002f976 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -593,7 +593,7 @@ static inline void vma_complete(struct vma_prepare *vp,
 		if (vp->file) {
 			uprobe_munmap(vp->remove, vp->remove->vm_start,
 				      vp->remove->vm_end);

fulalas commented 1 year ago

@sfjro, I'm always patching and building from scratch, so I'm always applying the extra patches manually. Would it be easier to have a special branch for testing so I can just pull from it? :)

BTW, this patch (the one before the last) prevents the kernel from being built.

fulalas commented 1 year ago

OK, good news! It seems it's fixed now! I applied all the patches in this thread, except this (because it breaks the build process), and the system is finally able to unmount everything now!

Gonna wait for more people from Porteus forum to confirm.

Nice work, man! Thanks a lot!

TurboBlaze commented 1 year ago

With @fulalas's kernel I no longer have any issue unmounting the union at shutdown/reboot in Porteus. https://i.imgur.com/Of7n4HB.png Thanks to @sfjro aka Junjiro Okajima for your hard work!

P.S. waiting for a new aufs patch ;)

sfjro commented 1 year ago

TurboBlaze:

With @fulalas's kernel I no longer have any issue unmounting the union at shutdown/reboot.

Guys, thank you for the tests. The patch (one-liner) will be merged in the release on Monday (14 Aug).

J. R. Okajima

sfjro commented 1 year ago

------- Blind-Carbon-Copy

From: "J. R. Okajima" @.>
To: @.
Subject: aufs5 and aufs6 GIT release
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: @.>
Date: Mon, 14 Aug 2023 01:45:43 +0900
Message-ID: @.>

o bugfix

o misc

(aufs-util.git)

J. R. Okajima

------- End of Blind-Carbon-Copy

fulalas commented 1 year ago

@sfjro, do you have any plans to create a new branch for 6.4.x and 6.5.x?

Thanks!

sfjro commented 1 year ago

fulalas:

@sfjro, do you have any plans to create a new branch for 6.4.x and 6.5.x?

Of course. Here is my plan.

J. R. Okajima

TurboBlaze commented 1 year ago

Not bad to see aufs support for kernel 6.5.x

peabee commented 1 year ago

As an experiment..... tried to apply aufs6.x-rcN to 6.5-rc7 .......... One failure to patch fs/splice.c from aufs6-base.patch

patch -N -p1 < aufs6-base.patch
patching file fs/splice.c
Hunk #1 succeeded at 928 (offset 63 lines).
Hunk #2 FAILED at 876.
1 out of 2 hunks FAILED -- saving rejects to file fs/splice.c.rej

code doesn't seem to exist anymore.... splice.c.txt attached

@@ -876,9 +876,9 @@ static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
 /*
  * Attempt to initiate a splice from a file to a pipe.
  */
-static long do_splice_to(struct file *in, loff_t *ppos,
-            struct pipe_inode_info *pipe, size_t len,
-            unsigned int flags)
+long do_splice_to(struct file *in, loff_t *ppos,
+         struct pipe_inode_info *pipe, size_t len,
+         unsigned int flags)
 {
    unsigned int p_space;
    int ret;
peabee commented 1 year ago

6.5 has been released.... @sfjro would it be possible to say what changes you think will be needed to aufs6-base.patch for splice.c in 6.5? Many thanks

sfjro commented 1 year ago

PB:

@sfjro would it be possible to say what changes you think will be needed to aufs6-base.patch for splice.c in 6.5?

I have not reached v6.5 yet. Wait a week or two plz.

J. R. Okajima

peabee commented 1 year ago

Many thanks for aufs-6.4 and aufs-6.5

Kernel 6.5.1 built with aufs-6.5 and seems fine.... :-))

fulalas commented 1 year ago

@sfjro, thanks a looot for the hard work. All recent branches are working flawlessly, including not only 6.3.x but also 6.4.x and 6.5.x. You're a hero! :)

I'm closing this issue.

sfjro commented 1 year ago

fulalas:

@sfjro, thanks a looot for the hard work. All recent branches are working flawlessly, including not only 6.3.x but also 6.4.x and 6.5.x. You're a hero! :)

Haha, glad to hear that. Thank you.

J. R. Okajima