rpm-software-management / rpm

The RPM package manager
http://rpm.org

EXDEV error with rename(2) on overlayfs #2355

Open dmnks opened 1 year ago

dmnks commented 1 year ago

Running rpmdb --rebuilddb in a Fedora 37 podman container results in the following error:

error: failed to replace old database with new database!
error: replace files in /usr/lib/sysimage/rpm with files from /usr/lib/sysimage/rpmrebuilddb.12 to recover

Investigating further with strace, the culprit is actually the following rename call:

$ strace -e rename rpmdb --rebuilddb
rename("/usr/lib/sysimage/rpm", "/usr/lib/sysimage/rpmold.16") = -1 EXDEV (Invalid cross-device link)
error: failed to replace old database with new database!
error: replace files in /usr/lib/sysimage/rpm with files from /usr/lib/sysimage/rpmrebuilddb.16 to recover
+++ exited with 1 +++

Turns out, this is a limitation of overlayfs, as mentioned in https://github.com/torvalds/linux/blob/v4.8-rc2/fs/overlayfs/copy_up.c#L318-L322 (also reported e.g. here: https://github.com/moby/moby/issues/25409):

Directory renames only allowed on "pure upper" (already created on
upper filesystem, never copied up).  Directories which are on lower or
are merged may not be renamed.  For these -EXDEV is returned and
userspace has to deal with it.  This means, when copying up a
directory we can rely on it and ancestors being stable.

A more low-level reproducer using bubblewrap and fuse-overlayfs looks like this (on Fedora 37):

$ sudo dnf install -y --installroot=$PWD/tree --releasever=37 python3
$ sudo chmod u+w tree/
$ sudo chown test: -R tree/
$ mkdir tree/foo  # this will be renamed to reproduce the issue
$ mkdir upper work merged
$ fuse-overlayfs -o lowerdir=tree,upperdir=upper,workdir=work merged
$ bwrap --unshare-pid --dev-bind $PWD/merged / --dev /dev --proc /proc python3
Python 3.11.1 (main, Dec  7 2022, 00:00:00) [GCC 12.2.1 20221121 (Red Hat 12.2.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.rename('/foo', '/bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 18] Invalid cross-device link: '/foo' -> '/bar'

Currently, I'm not aware of any workaround.

It seems that the only way to make this work cleanly would be to add filesystem-specific code to RPM, however that doesn't sound great either.

Any thoughts?

dmnks commented 1 year ago

Note that the touch /usr/lib/sysimage/rpm/* workaround (formerly /var/lib/rpm/*), also discussed in https://github.com/radiasoft/containers/issues/91, doesn't help.

pmatilai commented 1 year ago

Yep, this is the same old, many reports exist in various places.

For the purposes of the test-suite, can we not make the rpmdb a "pure upper" directory? Move the underlying directory away to start with and then put it in place from the upper layer?

Edit: hmm, in the test-suite case it should already be "pure upper" I think?

dmnks commented 1 year ago

> Yep, this is the same old, many reports exist in various places.

Yup, I know :smile: It's just that the original copy-up (touch ...) trick (also used by dnf-plugin-ovl) doesn't help in this rename() case. I wonder if something changed in the overlayfs implementation recently or whether it was always the case. Anyway,

> For the purposes of the test-suite, can we not make the rpmdb a "pure upper" directory? Move the underlying directory away to start with and then put it in place from the upper layer?

Indeed, this straightforward workaround somehow eluded me :smile:, thanks. I just tried it manually and it worked.

> Edit: hmm, in the test-suite case it should already be "pure upper" I think?

Not really - the filesystem tree used by a test case is our "lower". Copying the database manually (outside of the container) to the "upper" directory should do the trick, as mentioned above.

dmnks commented 1 year ago

> Yup, I know :smile: It's just that the original copy-up (touch ...) trick (also used by dnf-plugin-ovl) doesn't help in this rename() case. I wonder if something changed in the overlayfs implementation recently or whether it was always the case. Anyway,

Heh, to reply to myself:

Of course the copy-up trick doesn't work - as the comment from the overlayfs source code says, only "pure upper" directories can be renamed. Triggering a copy-up doesn't make the directory a "pure upper".

dmnks commented 1 year ago

Just for future reference and/or any confused onlookers - the "test cases" we mentioned above are in fact a WIP that's not on master yet. For those, the "make the database a pure upper" workaround does the job, however for the usual podman/docker use case, the issue remains.

pmatilai commented 1 year ago

Indeed. This is related to #1580.

> > Edit: hmm, in the test-suite case it should already be "pure upper" I think?
>
> Not really - the filesystem tree used by a test case is our "lower". Copying the database manually (outside of the container) to the "upper" directory should do the trick, as mentioned above.

Yup, but in the test-suite, there really isn't an rpmdb in there, it's just an empty directory from "make install" that we could just as well create later from each individual test. The tests already do RPMDB_INIT pretty much everywhere, so all that we probably need is rm -rf <db> after "make install"

dmnks commented 1 year ago

Oh, right, right. The reason I encountered this issue in the first place is that my lower tree contained an rpmdb, in fact, which was just a result of a dnf --installroot call that populated it earlier. Indeed, removing it afterwards (before creating per-test overlays) is what needs to be done here.

dmnks commented 1 year ago

Related: https://github.com/rpm-software-management/rpm/pull/1754

dmnks commented 1 year ago

FTR, the mv(1) command works around this by simply recursively copying the directory and removing the old one. Therefore, the workaround is as simple as doing:

# cd /usr/lib/sysimage
# cp -r rpm rpmold.1
# rm -rf rpm
# rpmdb --rebuilddb
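
The copy-and-remove fallback that mv(1) performs can be sketched in Python. This is only an illustration of the general technique, not code from RPM or coreutils. `rename_with_fallback` is a hypothetical helper name, and unlike rename(2) the fallback is not atomic, so an interruption midway can leave both a partial copy and the original behind.

```python
import errno
import os
import shutil


def rename_with_fallback(src, dst):
    """Rename src to dst; if rename(2) fails with EXDEV, copy and delete instead.

    Hypothetical helper for illustration only. The fallback path is not atomic.
    """
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # some other failure, so do not mask it
        if os.path.isdir(src) and not os.path.islink(src):
            # Recreate the whole tree at dst, then drop the original.
            shutil.copytree(src, dst, symlinks=True)
            shutil.rmtree(src)
        else:
            # Single file or symlink: copy it as-is, then unlink the source.
            shutil.copy2(src, dst, follow_symlinks=False)
            os.unlink(src)
```

On overlayfs this should let a rename of a lower or merged directory complete, at the cost of copying all of its contents.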

xuchunmei000 commented 10 months ago

> FTR, the mv(1) command works around this by simply recursively copying the directory and removing the old one. Therefore, the workaround is as simple as doing:
>
> # cd /usr/lib/sysimage
> # cp -r rpm rpmold.1
> # rm -rf rpm
> # rpmdb --rebuilddb

These workaround steps make rpmdb --rebuilddb succeed, but the new database files in /usr/lib/sysimage/rpm are not correct. For example, rpm -q glibc fails with "package glibc is not installed".

dmnks commented 9 months ago

Oh... Indeed. Dunno what I was thinking. The correct workaround is this:

# cd /usr/lib/sysimage
# cp -r rpm rpm.temp  # copy-up onto the upper layer
# rm -rf rpm
# mv rpm.temp rpm  # move back to the original path so RPM can find it
# rpmdb --rebuilddb

Thanks for noticing!
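
For scripting, the corrected steps above can be sketched in Python. This is a minimal sketch under assumptions, not anything RPM ships: `make_pure_upper` and `rebuild_rpmdb` are hypothetical names, the `.temp` suffix mirrors the rpm.temp path used above, and the default database path is the Fedora one from this thread.

```python
import os
import shutil
import subprocess


def make_pure_upper(path):
    """Recreate directory `path` so that it is created fresh on the upper layer.

    Hypothetical helper: copy to a sibling, remove the original, move the copy back.
    """
    path = path.rstrip("/")
    temp = path + ".temp"                       # assumed scratch name, as in rpm.temp
    shutil.copytree(path, temp, symlinks=True)  # the copy is created on the upper layer
    shutil.rmtree(path)                         # remove the lower/merged original
    os.rename(temp, path)                       # move back so RPM can find it


def rebuild_rpmdb(dbpath="/usr/lib/sysimage/rpm"):
    """Apply the workaround, then run the rebuild (path is an assumption)."""
    make_pure_upper(dbpath)
    subprocess.run(["rpmdb", "--rebuilddb"], check=True)
```

Calling `rebuild_rpmdb()` as root inside the container corresponds to the five shell steps. As with the shell version, the database is briefly missing between the remove and the rename, so nothing else should be using RPM while it runs.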

dmnks commented 9 months ago

Anyway, it turns out this is not something we'll want to handle explicitly in the RPM code as it's really OverlayFS-specific. I'll close the ticket now.

dmnks commented 7 months ago

As per discussion in #2905, reopening now.