openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Unable to access snapshot files using /.zfs/snapshot/testsnapshot/ #9461

Open roker opened 5 years ago

roker commented 5 years ago

System information

Type                  Version/Name
Distribution Name     Fedora
Distribution Version  30
Linux Kernel          5.2.18
Architecture          x86_64
ZFS Version           0.8.2-1
SPL Version           0.8.2-1

Describe the problem you're observing

Unable to access snapshot files using /.zfs/snapshot/test/. Verified that this works on FreeBSD 11, and also with the Ubuntu module 0.7.5. I can't stress enough how serious this issue is.

Describe how to reproduce the problem

sudo bash
echo "should be working" > /opt/testsnapshot
zfs snapshot -r roottank@testsnapshot
rm /opt/testsnapshot
cd /.zfs/snapshot/testsnapshot/opt
bash: cd: opt: Object is remote

Include any warning/errors/backtraces from the system logs

There are no visible errors.

jhyeon commented 5 years ago

I had the same problem with zfs 0.8.2, so I downgraded zfs to version 0.8.1 which allows accessing the root snapshots fine. BTW, snapshots of the other filesystems mounted under / such as /home/.zfs/snapshot/* can still be accessed with zfs 0.8.2.

arkoort commented 5 years ago

The same problem here. Debian testing, zfs-dkms 0.8.2-2. Some additional info:

x220i:~% ls /.zfs/snapshot
2019-10-16-2057/                       zfs-auto-snap_daily4-2019-10-13-0412/    zfs-auto-snap_frequent-2019-10-17-1930/  zfs-auto-snap_hourly3-2019-10-17-1517/
zfs-auto-snap_daily-2019-10-12-0306/   zfs-auto-snap_daily8-2019-10-01-0311/    zfs-auto-snap_frequent-2019-10-17-1945/  zfs-auto-snap_hourly6-2019-10-17-0617/
zfs-auto-snap_daily-2019-10-14-0311/   zfs-auto-snap_daily8-2019-10-09-0257/    zfs-auto-snap_hourly-2019-10-17-1617/    zfs-auto-snap_hourly6-2019-10-17-1217/
zfs-auto-snap_daily-2019-10-15-0320/   zfs-auto-snap_daily8-2019-10-17-0318/    zfs-auto-snap_hourly-2019-10-17-1717/    zfs-auto-snap_hourly6-2019-10-17-1817/
zfs-auto-snap_daily-2019-10-16-0316/   zfs-auto-snap_frequent-2019-10-17-1900/  zfs-auto-snap_hourly-2019-10-17-1917/
zfs-auto-snap_daily4-2019-10-05-0258/  zfs-auto-snap_frequent-2019-10-17-1915/  zfs-auto-snap_hourly3-2019-10-17-0917/
x220i:~% ls /.zfs/snapshot/zfs-auto-snap_daily8-2019-10-09-0257/
x220i:~% ls /.zfs/snapshot/zfs-auto-snap_daily8-2019-10-09-0257/.
ls: cannot access '/.zfs/snapshot/zfs-auto-snap_daily8-2019-10-09-0257/.': Object is remote
zsh: exit 2     ls
x220i:~% zfs list -o space -rt all
NAME                                                     AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
x220                                                      159G   287G        0B     96K             0B       287G
x220/ROOT                                                 159G   270G        0B     96K             0B       270G
x220/ROOT/debian                                          159G   270G     25.1G    245G             0B         0B
x220/ROOT/debian@zfs-auto-snap_daily8-2019-10-01-0311        -  1.47G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily4-2019-10-05-0258        -  1.21G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily8-2019-10-09-0257        -  2.15G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily-2019-10-12-0306         -  1.36G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily4-2019-10-13-0412        -  1009M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily-2019-10-14-0311         -  1.05G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily-2019-10-15-0320         -  2.01G         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily-2019-10-16-0316         -  1.74G         -       -              -          -
x220/ROOT/debian@2019-10-16-2057                             -   183M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_daily8-2019-10-17-0318        -  52.3M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly6-2019-10-17-0617       -  42.8M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly3-2019-10-17-0917       -  10.5M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly6-2019-10-17-1217       -  17.3M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly3-2019-10-17-1517       -  12.1M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly-2019-10-17-1617        -  5.17M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly-2019-10-17-1717        -  5.86M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly6-2019-10-17-1817       -  37.6M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_frequent-2019-10-17-1915      -  5.95M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_hourly-2019-10-17-1917        -  6.00M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_frequent-2019-10-17-1930      -  11.8M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_frequent-2019-10-17-1945      -  13.8M         -       -              -          -
x220/ROOT/debian@zfs-auto-snap_frequent-2019-10-17-2000      -  6.68M         -       -              -          -
x220/home                                                 159G  80.1M        0B     96K             0B      80.0M
x220/home/ark                                             159G  80.0M        0B     96K             0B      79.9M
x220/home/ark/tmp                                         159G  79.9M        0B     96K             0B      79.9M
x220/home/ark/tmp/android                                 159G  79.9M        0B     96K             0B      79.8M
x220/home/ark/tmp/android/music                           159G  79.8M        0B   79.8M             0B         0B
x220/swap                                                 162G  4.25G        0B    642M          3.63G         0B
x220/var                                                  159G  12.4G        0B     96K             0B      12.4G
x220/var/tmp                                              159G  12.4G        0B     96K             0B      12.4G
x220/var/tmp/gobuild                                      159G  12.4G        0B   12.4G             0B         0B

I switched to 0.8.2-2 (from 0.7.12-2+deb10u1) on 2019-10-12. The snapshots were created automatically, both before and after the switch. All snapshots have sensible sizes, so I think they are fully functional, except that I cannot access them.

jhyeon commented 5 years ago

PR #9384, which resolves #9381, still does not allow me to access snapshots under /.zfs/snapshot/. The reason might be that I am using an initramfs which first mounts the root filesystem under /sysroot/ before doing switch_root. See my comments in #9381.

ccorn commented 5 years ago

Same here. Arch Linux, kernel 5.3.7, with package zfs-dkms 0.8.2-1 (archzfs-dkms). Snapdirs of non-root filesystems can be accessed without problems, e.g.

ls /home/.zfs/snapshot/20191030-2155

works. journalctl -b -g zfs then shows lines such as:

Oct 31 10:37:04 Celsius systemd[6199]: home-.zfs-snapshot-20191030\x2d2155.mount: Succeeded.
Oct 31 10:37:04 Celsius systemd[1]: home-.zfs-snapshot-20191030\x2d2155.mount: Succeeded.

And grep -F .zfs /proc/mounts shows the mount for a while. However, trying ls /.zfs/snapshot/20191030-2155 yields no output and no error message. Doing ls /.zfs/snapshot/20191030-2155/. (note the appended /.) yields the error:

ls: cannot access '/.zfs/snapshot/20191030-2155/.': Object is remote

And there are no associated entries in the output of journalctl -b -g zfs nor grep -F .zfs /proc/mounts.

Any hints on how to dig further?

ElvishJerricco commented 4 years ago

I also get this error for any datasets that have mountpoint=legacy with 0.8.2.

ElvishJerricco commented 4 years ago

I take that back. It's all my datasets that are mounted during initramfs that have the problem; my legacy mount that's mounted in stage 2 boot doesn't have this problem. Also, of course, if I create a new legacy mount, I can mount it and access its .zfs/snapshot/foo contents.

Bronek commented 4 years ago

I have the same problem; it is a regression in the 0.8.2 release (it worked fine in 0.8.1).

behlendorf commented 4 years ago

@Bronek you're likely seeing the issue resolved by PR #9384. We'll get the fix applied to 0.8.3.

Ryushin commented 4 years ago

Posting this in here since #9384 was closed:

Looks like I found a different kind of bug related to this. I've applied the patch listed above (#9353). My rpool looks like this:

rpool                   1.27T  2.95T   176K  none
rpool/ROOT               145G  2.95T  50.1G  /
rpool/mysql             53.0G  2.95T  23.0G  /var/lib/mysql
rpool/mysql-log         5.79G  2.95T  1.44G  /var/lib/mysql-log
rpool/plexmediaserver    206G  2.95T   171G  /var/lib/plexmediaserver
rpool/spool              115G  2.95T  35.5G  /var/spool
rpool/virtual_machines   778G  2.95T   743G  /var/lib/libvirt

I can see inside snapshots for rpool/ROOT:

windwalker:~# ls /.zfs/snapshot/zfs-auto-snap_daily-2019-12-07-1107/
bin   boot.tar.xz  dead.letter  etc   lib    lost+found  mnt        opt   proc  run   share  sys       tmp  var
boot  cdrom        dev          home  lib64  media       netshares  path  root  sbin  srv    tftpboot  usr

Looking inside snapshots for any other dataset in rpool results in a "Too many levels of symbolic links" error:

ls /var/lib/libvirt/.zfs/snapshot/zfs-auto-snap_daily-2019-12-07-1107/
ls: cannot access '/var/lib/libvirt/.zfs/snapshot/zfs-auto-snap_daily-2019-12-07-1107/': Too many levels of symbolic links

ccorn commented 4 years ago

I just upgraded to 0.8.3, and the issue remains as described in my earlier post.

$ ls /home/.zfs/snapshot/
20200121-1930  20200123-0040  20200124-2134  20200126-0146
20200122-1325  20200123-1559  20200125-1946  20200126-2030
$ ls /home/.zfs/snapshot/20200126-2030  # works
ccorn
$ ls /.zfs/snapshot/
20200121-1930  20200123-0040  20200124-2134  20200126-0146
20200122-1325  20200123-1559  20200125-1946  20200126-2030
$ ls /.zfs/snapshot/20200126-2030  # no output
$ ls /.zfs/snapshot/20200126-2030/.
ls: cannot access '/.zfs/snapshot/20200126-2030/.': Object is remote

seonwoolee commented 4 years ago

I too am having this issue on 0.8.3 (though in my case it affects multiple filesystems under my zroot pool, which contains /). This issue doesn't occur in any of my other pools that do not contain /.

$ ls /home/.zfs/snapshot/
2020-0122-0020_weekly         2020-0128-0100_hourly         2020-0128-1400_hourly
2020-0109-0000_daily        2020-0122-1839_before-pacman  2020-0128-0200_hourly         2020-0128-1400_quarterhour
$ ls /home/.zfs/snapshot/2020-0128-1400_quarterhour/ #returns nothing
$ ls /.zfs/snapshot/
2020-0108-0834_weekly       2020-0122-0020_weekly         2020-0128-0100_hourly         2020-0128-1400_hourly
2020-0109-0000_daily        2020-0122-1839_before-pacman  2020-0128-0200_hourly         2020-0128-1400_quarterhour
$ ls /.zfs/snapshot/2020-0128-1400_quarterhour/ #returns nothing

(I've abbreviated the output of ls when listing the snapshots available)

marker5a commented 4 years ago

Would just like to chime in... seeing the same behavior on ZFS 0.8.3 and 0.8.2 on Arch Linux.

One example from the logs:

1583461577 zfs_ctldir.c:1086:zfsctl_snapshot_mount(): Unable to automount /new_root/var/lib/docker/.zfs/snapshot/autosnap_2020-02-29_00:00:08_daily error=512

marker5a commented 4 years ago

I can also confirm that by creating a non-root pool, I can access everything fine in the snapshot folder, so this seems to be similar to what ElvishJerricco reported. Any root-derived filesystem, though, does not work as expected.

jhyeon commented 4 years ago

I suspect this error occurs on systems with an initramfs which chroot(2)s to the new root filesystem and starts the init process by running switch_root(8).

ccorn commented 4 years ago

#10128 indicates that reverting 5a1bf9e and 093bb64 solves this particular issue. Indeed that works for me. But that reopens the issues those commits were meant to solve. What now?

marker5a commented 4 years ago

#10128 indicates that reverting 5a1bf9e and 093bb64 solves this particular issue. Indeed that works for me. But that reopens the issues those commits were meant to solve. What now?

This was a good workaround for me.. yeah, as to what now, I'll leave that to the gurus :)

gyakovlev commented 4 years ago

still happening on

zfs-0.8.4-r1-gentoo
zfs-kmod-0.8.4-r0-gentoo

Non-root snapdirs are fine, but snapshots for the root dataset show up empty.

deeptho commented 4 years ago

Filename: /lib/modules/5.5.7-200.fc31.x86_64/extra/zfs.ko.xz
version: 0.8.4-1

Slightly different behaviour:
- non-root filesystems are OK (snapshot dirs visible)
- the root filesystem is OK immediately after boot (snapshot dirs visible), but problems occur after some heavy work (compiling a kernel)

gyakovlev commented 4 years ago

found 2 workarounds (still on 0.8.4)

after that snapshots will be accessible in /sysroot/.zfs/ just fine, but remain inaccessible via /.zfs/

/sysroot is the initial location dracut mounts the root dataset to via zfsutil, so somehow bind-mounting to this exact path, or re-mounting to / after pivot_root, restores the expected behaviour.

pineman commented 3 years ago

Also experiencing this on Arch Linux: I can't access the root snapdirs. The workaround involving mount -o bind to the original root mountpoint before chroot in initramfs works for me (it's /new_root on my system, as you can see below).

$ ls /.zfs/snapshot/a
$ cat /proc/spl/kstat/zfs/dbgmsg
...
1621944136   zfs_ctldir.c:1105:zfsctl_snapshot_mount(): Unable to automount /new_root/.zfs/snapshot/a error=512

openzfs 2.0.4, kernel 5.10.38
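
Editorial sketch: several reports in this thread identify the failure by the "Unable to automount" lines in /proc/spl/kstat/zfs/dbgmsg. The helper below is a minimal, unofficial Python sketch that pulls those lines out of a dbgmsg dump. The function and type names are hypothetical, and the line format is inferred only from the samples quoted in this thread (newer releases appear to add a hex token after the timestamp), so it may not match every ZFS version.

```python
import re
from typing import List, NamedTuple

# Pattern inferred from the dbgmsg lines quoted in this thread (an assumption,
# not a documented format). The hex token after the timestamp is optional.
_AUTOMOUNT_RE = re.compile(
    r"^(?P<ts>\d+)\s+(?:[0-9a-f]+\s+)?zfs_ctldir\.c:\d+:zfsctl_snapshot_mount\(\): "
    r"Unable to automount (?P<path>.+) error=(?P<error>-?\d+)\s*$"
)


class AutomountFailure(NamedTuple):
    timestamp: int  # seconds since the epoch, as logged
    path: str       # mount target the module tried to use
    error: int      # raw error value from the log line


def automount_failures(dbgmsg_text: str) -> List[AutomountFailure]:
    """Return every 'Unable to automount' record found in a dbgmsg dump."""
    failures = []
    for line in dbgmsg_text.splitlines():
        match = _AUTOMOUNT_RE.match(line.strip())
        if match:
            failures.append(
                AutomountFailure(
                    int(match.group("ts")),
                    match.group("path"),
                    int(match.group("error")),
                )
            )
    return failures
```

On an affected system this could be fed the contents of /proc/spl/kstat/zfs/dbgmsg (reading that file typically requires root). A path that begins with an initramfs-era directory such as /new_root is the symptom described above.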

carlosaguilarmelchor commented 3 years ago

Same here on Ubuntu 20.04.

zfs-0.8.3-1ubuntu12.9
zfs-kmod-0.8.3-1ubuntu12.7

mcr-ksh commented 2 years ago

Same here.

Description: Ubuntu 20.04.3 LTS
Release: 20.04
Codename: focal
0.8.3-1ubuntu12.12
srcversion: 121E11703180683A8728698
name: zfs
vermagic: 5.4.0-89-generic SMP mod_unload modversions

nicman23 commented 2 years ago

@mcr-ksh are you using legacy mounts ?

mcr-ksh commented 2 years ago

@nicman23 Not sure what legacy mounts are, but possibly. I'm using zfs mount pool/vol; before that, I set the mountpoint with zfs set mountpoint=/dir pool/vol.

sxc731 commented 2 years ago

Same issue on Ubuntu 21.10 running ZoL 2.0.6.

Both of @gyakovlev's workarounds worked for me; thank you so much for these!

Note that I could access .zfs/snapshot dirs for every filesystem but root (/.zfs/snapshot). I had the Unable to automount <snapdir> error=512 entries in /proc/spl/kstat/zfs/dbgmsg; my system is booted through ZFSbootmenu and dracut, which I strongly suspect uses a chroot mechanism similar to that described by @jhyeon above.

IsaacVaughn commented 2 years ago

I am still experiencing the problem with the root dir from initramfs.

1666904221 zfs_ctldir.c:1106:zfsctl_snapshot_mount(): Unable to automount /new_root/.zfs/snapshot/autosnap_2022-10-27_20:00:04_hourly error=512

Bind mounting to /new_root allows access.
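
Editorial sketch: in the reports above, the logged automount path starts with the directory where the initramfs originally mounted the dataset (/new_root, /sysroot), and bind-mounting onto that directory is the workaround that helped some users. The Python below is a minimal illustration of how one might work out that leftover prefix from a logged path. The function names are hypothetical, and the current mountpoint must be supplied by the caller (for example from zfs get mountpoint or findmnt). It does not perform any mount itself.

```python
from typing import Optional, Tuple

SNAPDIR = "/.zfs/snapshot/"


def split_automount_path(logged_path: str) -> Tuple[str, str]:
    """Split a logged path into (mountpoint the module used, snapshot name)."""
    mountpoint, sep, snapshot = logged_path.rpartition(SNAPDIR)
    if not sep:
        raise ValueError("not a .zfs/snapshot path: %r" % logged_path)
    # A dataset mounted at "/" yields an empty head, so normalise it.
    return (mountpoint or "/", snapshot)


def stale_prefix(logged_path: str, current_mountpoint: str) -> Optional[str]:
    """Return the leading directory the logged path has in addition to the
    mountpoint the dataset is visible at now, or None if they agree."""
    recorded, _ = split_automount_path(logged_path)
    recorded = recorded.rstrip("/")           # "/" becomes ""
    current = current_mountpoint.rstrip("/")  # "/" becomes ""
    if recorded == current or not recorded.endswith(current):
        return None
    return recorded[: len(recorded) - len(current)] or None
```

For the log line above, stale_prefix(path, "/") gives "/new_root", which matches the directory that had to be bind-mounted. As later comments show, the bind-mount workaround is not reported to help in every setup, so treat the result as a diagnostic hint only.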

Clete2 commented 1 year ago

Confirmed this is still a bug on 2.1.7. Sending a snapshot to another dataset allows me to access the files but I cannot access them from /.zfs/snapshot/...

Others confirmed this bug: https://forum.proxmox.com/threads/zfs-snapshot-problem.120172/

siv0 commented 1 year ago

The issue reported in the last comment seems like a "regression" introduced in zfs 2.1.7 (commit 4d22befde60087cbc6174122863353903df1d935), at least on Debian (and derived distros) with initramfs-tools:

A possible workaround might be to allow mount.zfs to provide the mntpoint hint explicitly or an option to set it to the mountpoint property of the dataset (for use in initramfs/dracut).

A very superficial grep through the code makes it seem that vfs_mntpoint is only used in this context?

motiejus commented 1 year ago

I have the same problem, but with /var/lib/.zfs, and the sysroot workaround does not help:

[root@hel1-a:/]# mount -o bind /var/lib /sysroot

[root@hel1-a:/]# find sysroot/.zfs/snapshot/
sysroot/.zfs/snapshot/
sysroot/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily
sysroot/.zfs/snapshot/autosnap_2023-01-12_17:00:01_hourly
...

Same -- empty directories -- in /var/lib/.zfs/snapshot/<...>/.

siv0 commented 1 year ago

I think you'd need to bind mount /var/lib to /sysroot/var/lib in this case (without explicitly trying)

motiejus commented 1 year ago

I think you'd need to bind mount /var/lib to /sysroot/var/lib in this case (without explicitly trying)

I think it's equivalent:

[root@hel1-a:/]# mount -o bind / /sysroot

[root@hel1-a:/]# mount -o bind /var/lib /sysroot/var/lib

[root@hel1-a:/]# find /sysroot/var/lib/.zfs/
/sysroot/var/lib/.zfs/
/sysroot/var/lib/.zfs/snapshot
/sysroot/var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily
/sysroot/var/lib/.zfs/snapshot/autosnap_2023-01-12_17:00:01_hourly
/sysroot/var/lib/.zfs/snapshot/autosnap_2023-01-07_16:00:04_monthly
...

siv0 commented 1 year ago

Do the snapshots get mounted if you try accessing them under /var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily

The issue (AFAICU) is that the .zfs control directory exists on the dataset, which is mounted at /var/lib after pivot_root - but that it has /sysroot/var/lib as prefix for mounting snapshots. i.e. does ls -l /var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily cause the snapshot to be mounted under /sysroot/var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily ?

motiejus commented 1 year ago

Do the snapshots get mounted if you try accessing them under /var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily

Nope.

The issue (AFAICU) is that the .zfs control directory exists on the dataset, which is mounted at /var/lib after pivot_root - but that it has /sysroot/var/lib as prefix for mounting snapshots. i.e. does ls -l /var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily cause the snapshot to be mounted under /sysroot/var/lib/.zfs/snapshot/autosnap_2023-01-10_00:00:02_daily ?

Likewise, no.

dberlin commented 1 year ago

Nowadays, this is caused by selinux enforcement as well

As root:
setenforce 0
mount -o bind / /sysroot

Does that fix them?

I started by trying to fix the selinux policy to allow it, but it's way beyond my skills.

ipaqmaster commented 1 year ago

I seem to still have this issue today when trying to browse any /.zfs/snapshot/* on my ZFS Root.

zfs-2.1.9-1
Kernel: 6.1.11-arch1-1

m-ueberall commented 1 year ago

I seem to still have this issue today when trying to browse any /.zfs/snapshot/* on my ZFS Root.

zfs-2.1.9-1
Kernel: 6.1.11-arch1-1

Using the same ZFS version in conjunction with another kernel (see issue #11563), one unfortunate side effect of trying to access the contents of .../.zfs/snapshot/ I see is that the snapshots in question will henceforth be marked as busy (until you reboot the system), preventing their automatic removal:

/etc/cron.hourly/zfs-auto-snapshot:
cannot destroy snapshot rpool/ROOT/ubuntu_cozx9s/var/lib@zfs-auto-snap_hourly-2023-02-23-0917: dataset is busy
cannot destroy snapshot rpool/ROOT/ubuntu_cozx9s/var/lib@zfs-auto-snap_hourly-2023-02-23-0817: dataset is busy

jdwhite commented 1 year ago

Nowadays, this is caused by selinux enforcement as well [...] I started by trying to fix the selinux policy to allow it, but it's way beyond my skills.

Thanks for mentioning SELinux as a potential cause. I should have checked that earlier.

Linux mcp.home.menelos.com 6.2.14-300.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Mon May  1 00:55:28 UTC 2023 x86_64 GNU/Linux
Fedora release 38 (Thirty Eight)
zfs-2.1.11-1

I don't know when this stopped working, but it was definitely prior to 2.1.11-1. I've always run SELinux on this system, even before I installed OpenZFS, and snapshots were working at one point, as my backup program uses the latest snapshot directory to create deltas. Failed backups are how I noticed that snapshots weren't being automounted.

As soon as I cd into a snapshot dir I see these entries in dbgmsg and audit.log:

# cat /proc/spl/kstat/zfs/dbgmsg | grep -i automount | tail -1
1683319089   zfs_ctldir.c:1105:zfsctl_snapshot_mount(): Unable to automount /spool/containers/plex/.zfs/snapshot/autosnap_2023-05-05_03:00:05_daily error=32000

# grep -i denied /var/log/audit/audit.log | tail -1
type=AVC msg=audit(1683319089.711:370425): avc:  denied  { execute } for pid=1069422 comm="env" name="mount" dev="md127" ino=537413737 scontext=system_u:system_r:kernel_generic_helper_t:s0 tcontext=system_u:object_r:mount_exec_t:s0 tclass=file permissive=0
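
Editorial sketch: the AVC record above names the denied permission, the source and target contexts, and the object class, which are the pieces a local policy rule is built from. The Python below is a minimal, unofficial parser for a record of that shape. The function names are hypothetical, the field layout is assumed from this single sample, and the generated rule is only a candidate to review by hand, not a substitute for audit2allow.

```python
import re
from typing import Any, Dict, Optional

_AVC_RE = re.compile(r"avc:\s+denied\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)$")
_FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_avc_denial(line: str) -> Optional[Dict[str, Any]]:
    """Parse one 'avc: denied' audit record into a dict, or return None."""
    match = _AVC_RE.search(line)
    if match is None:
        return None
    # key=value pairs after "for"; quoted values lose their quotes.
    fields = {key: value.strip('"') for key, value in _FIELD_RE.findall(match.group("rest"))}
    fields["perms"] = match.group("perms").split()
    return fields


def selinux_type(context: str) -> str:
    """Return the type field of a user:role:type[:level] security context."""
    return context.split(":")[2]


def allow_rule(denial: Dict[str, Any]) -> str:
    """Render a candidate allow rule for one parsed denial (review before use)."""
    return "allow {} {}:{} {{ {} }};".format(
        selinux_type(denial["scontext"]),
        selinux_type(denial["tcontext"]),
        denial["tclass"],
        " ".join(denial["perms"]),
    )
```

For the record above this yields a rule covering only the execute permission on mount_exec_t. Each denial tends to reveal the next one only after the previous rule is loaded, which is consistent with the roughly dozen iterations described below.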

If you install the setroubleshoot package you'll get these in your syslog when an action is denied.

May  5 15:38:12 mcp setroubleshoot[1069685]: SELinux is preventing env from execute access on the file /usr/bin/mount.#012#012*****  Plugin catchall (100. confidence) suggests   **************************#012#012If you believe that env should be allowed execute access on the mount file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'env' --raw | audit2allow -M my-env#012# semodule -X 300 -i my-env.pp#012

I didn't follow the setroubleshoot instructions as I have my own way of creating local SELinux policies with a Makefile. This was quite a process, as I had to create, install, and test the policy about a dozen times before I was successful in getting snapshot automounting working again.

Below is the policyname.te file I created. Loading this allows the snapshot automounting behavior to work, but I have not fully explored the security implications of this policy. Use at your own risk.

module zfs.local 1.0;

require {
    type mount_exec_t;
    type fs_t;
    type unlabeled_t;
    type container_file_t;
    type device_t;
    type kernel_generic_helper_t;
    class file { execute execute_no_trans getattr map open read };
    class dir { mounton search };
    class capability { setgid setuid sys_admin };
    class chr_file { ioctl open read write };
    class filesystem mount;
}

#============= kernel_generic_helper_t ==============

allow kernel_generic_helper_t container_file_t:dir search;
allow kernel_generic_helper_t device_t:chr_file { ioctl open read write };
allow kernel_generic_helper_t fs_t:filesystem mount;
allow kernel_generic_helper_t mount_exec_t:file { execute execute_no_trans getattr map open read };
allow kernel_generic_helper_t self:capability { setgid setuid sys_admin };
allow kernel_generic_helper_t unlabeled_t:dir { mounton search };

tilgovi commented 1 year ago

I'm no longer experiencing this issue testing out the upcoming Ubuntu 23.10.

Kernel 6.5.0.7
ZFS 2.2.0~rc3-0ubuntu4

darkbasic commented 1 year ago

I'm still experiencing some variation of this even with zfs 2.2-git:

[niko@arch-phoenix ~]$ ls -al /.zfs/snapshot/zrepl_20231009_050257_000/
total 0
drwxrwxrwx  1 root root 0 Oct  9 07:02 .
drwxrwxrwx 39 root root 2 Oct  9 08:52 ..
[niko@arch-phoenix ~]$ mount | grep zrepl_20231009_050257_000
[niko@arch-phoenix ~]$

As you can see, sometimes it doesn't automount snapshots when you try to access them.

nabijaczleweli commented 10 months ago

just reproed this, but in a much funnier way:

nabijaczleweli@chrust:~$ ls .zfs/snapshot/pre-keymap/
ls: cannot access '/home/nabijaczleweli/.zfs/snapshot/pre-keymap/': Too many levels of symbolic links
nabijaczleweli@chrust:~$ ls /.zfs/snapshot/pre-keymap/
audio-script.tar  cros-keyboard-map  foreign.nabijaczleweli.xyz  typescript  xev.log
nabijaczleweli@chrust:~/code/babfig/i3status.rs$ findmnt
TARGET                                                SOURCE                                     FSTYPE      OPTIONS
/                                                     chrust-zoot                                zfs         rw,relatime,xattr,posixacl,casesensitive
├─/home                                               chrust-zoot/home                           zfs         rw,relatime,xattr,posixacl,casesensitive
│ └─/home/nabijaczleweli                              chrust-zoot/home/nabijaczleweli            zfs         rw,relatime,xattr,posixacl,casesensitive
└─/.zfs/snapshot/pre-keymap                           chrust-zoot/home/nabijaczleweli@pre-keymap zfs         ro,relatime,xattr,posixacl,casesensitive
$ uname -a
Linux chrust 6.6.9-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.9-1 (2024-01-01) x86_64 GNU/Linux
$ zfs --version
zfs-2.2.2-3
zfs-kmod-2.2.2-3

note how first it fails with ELOOP, but then the automounter somehow mounts chrust-zoot/home/nabijaczleweli@pre-keymap instead of chrust-zoot@pre-keymap on /.zfs/snapshot/pre-keymap!

using i-t on current sid (2024-01-09)

mschiff commented 9 months ago

I can reproduce this behavior with OpenZFS 2.2.2 on Linux 6.1.69, using dracut initramfs and the workaround mentioned above still works:

Empty:

 ~ # ls -l /.zfs/snapshot/zfs-auto-snap_hourly-2024-01-24-0001/
total 0

Workaround:

 ~ # mkdir -p /sysroot; mount -o bind / /sysroot
 ~ # ls -l /sysroot/.zfs/snapshot/zfs-auto-snap_hourly-2024-01-24-0001/
total 61531
drwxr-xr-x.  2 root root      120 Jan 21 02:56 bin
drwxr-xr-x.  4 root root       12 Jan 12 00:59 boot
drwxr-xr-x.  2 root root        4 Jan  8 18:03 dev
drwxr-xr-x. 59 root root      149 Jan 23 23:54 etc
drwxr-xr-x.  2 root root        2 Jan 11 01:11 home
drwxr-xr-x. 11 root root       37 Jan 17 17:17 lib
drwxr-xr-x.  7 root root      141 Jan 23 23:52 lib64
drwxr-xr-x.  2 root root        2 Jan  8 18:03 media
drwxr-xr-x.  4 root root        4 Jan 24 00:32 mnt
drwxr-xr-x.  3 root root        3 Jan 11 15:46 opt
drwxr-xr-x.  2 root root        2 Jan  8 18:03 proc
drwx------.  9 root root       20 Jan 24 00:40 root
drwxr-xr-x.  2 root root        2 Jan  8 18:03 run
drwxr-xr-x.  2 root root      188 Jan 22 10:21 sbin
drwxr-xr-x.  2 root root        2 Jan  8 18:03 sys
drwxrwxrwt.  9 root root        9 Jan 23 23:54 tmp
drwxr-xr-x. 12 root root       12 Jan  8 18:20 usr
drwxr-xr-x.  8 root root       10 Jan 12 09:56 var
drwxr-xr-x.  2 root root        2 Jan 12 16:58 zdata
 ~ #

prescientmoon commented 9 months ago

I still get this issue, and the workaround does not work.

$ zfs --version
zfs-2.2.2-1
zfs-kmod-2.2.2-1
$ uname -r
6.6.4

mschiff commented 9 months ago

I still get this issue, and the workaround does not work.

It only works if you bind-mount to where your rootfs was mounted by the initrd. Did you do this?

prescientmoon commented 9 months ago

I still get this issue, and the workaround does not work.

It only works if you bind-mount to where your rootfs was mounted by the initrd. Did you do this?

I was trying to access snapshots in a child of the root dataset. What I did was:

This didn't fix anything though. And yeah, as far as I know, /sysroot is indeed the correct path for it.

prescientmoon commented 9 months ago

In fact, it seems like snapshots do show up for the very root dataset

albert-a commented 5 months ago

Hi, I ran into the same problem with current ZFS on Proxmox inside LXC.

pve-manager/8.2.2/9355359cd7afbae4 (running kernel: 6.8.4-3-pve)
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
zfsutils-linux: 2.2.3-pve2
zfs-2.2.3-pve2
zfs-kmod-2.2.3-pve2

ls inside the LXC container fails with a Too many levels of symbolic links error, although I experience no problems accessing the same snapshots on the host.

But in the LXC container which is started on the dataset the behaviour is totally random:

Most of the time both snapshots become accessible approximately 4-5 minutes after the container is started.

albert-a commented 3 months ago

Just linking the other threads related to the problem

  1. How to give LXD container access to (hundreds of) ZFS snapshots
  2. Access zfs-snapshots inside lxc container

mfundul commented 3 months ago

Below is a reproducer for ZFS version (Ubuntu 22.04):

$ zfs --version
zfs-2.1.5-1ubuntu6~22.04.4
zfs-kmod-2.2.0-0ubuntu1~23.10.3

The following steps reproduce the issue reliably:

# mkdir /chroot
# zpool create -f -m /chroot/repo-002 repo-002 /dev/sdb
# zfs create repo-002/test
# cd /chroot/repo-002/test/
# dd if=/dev/urandom of=0.sparse bs=1M count=128
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0,273947 s, 490 MB/s
# zfs snapshot repo-002/test@now
# zfs list -t all
NAME                USED  AVAIL     REFER  MOUNTPOINT
repo-002            128M  3.75G       24K  /chroot/repo-002
repo-002/test       128M  3.75G      128M  /chroot/repo-002/test
repo-002/test@now     0B      -      128M  -

# cd /
# mount --rbind / /chroot
# chroot /chroot/
# ls /chroot/repo-002/test/.zfs/snapshot/now/

# tail /proc/spl/kstat/zfs/dbgmsg 
1723203770   spa_history.c:293:spa_history_log_sync(): command: zpool create -f -m /chroot/repo-002 repo-002 /dev/sdb
1723203775   spa_history.c:329:spa_history_log_sync(): ioctl reopen
1723203788   spa_history.c:297:spa_history_log_sync(): txg 12 create repo-002/test (id 142)  
1723203793   spa_history.c:329:spa_history_log_sync(): ioctl create
1723203793   spa_history.c:293:spa_history_log_sync(): command: zfs create repo-002/test
1723203825   spa_history.c:297:spa_history_log_sync(): txg 20 snapshot repo-002/test@now (id 256)  
1723203830   spa_history.c:329:spa_history_log_sync(): ioctl snapshot
1723203830   spa_history.c:293:spa_history_log_sync(): command: zfs snapshot repo-002/test@now
1723204035   zfs_ctldir.c:1161:zfsctl_snapshot_mount(): Unable to automount /chroot/repo-002/test/.zfs/snapshot/now error=512
1723204035   zfs_ctldir.c:1161:zfsctl_snapshot_mount(): Unable to automount /chroot/repo-002/test/.zfs/snapshot/now error=512

nabijaczleweli commented 3 months ago

I can repro this reproducer (bdf4d6be1de870b16d4f7997b235d9f19dd7e30e):

# mkdir /tmp/chroot
# truncate -s 1G /tmp/1G
# zpool create -f -m /tmp/chroot/repo-002 repo-002 /tmp/1G
# zfs create repo-002/test

# cd /tmp/chroot/repo-002/test/
# head -c 32M < /dev/urandom > 0.sparse
# zfs snapshot repo-002/test@now
# zfs list -t all
# cd /
# mount --rbind / /tmp/chroot
# chroot /tmp/chroot/
# ls /tmp/chroot/repo-002/test/.zfs/snapshot/now/

# tail /proc/spl/kstat/zfs/dbgmsg 
...
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1158:zfsctl_snapshot_mount(): automount='/usr/bin/env' 'mount' '-t' 'zfs' '-n' 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now' '(null)'
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1161:zfsctl_snapshot_mount(): automount=error=0x100
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1164:zfsctl_snapshot_mount(): Unable to automount /tmp/chroot/repo-002/test/.zfs/snapshot/now error=256
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1158:zfsctl_snapshot_mount(): automount='/usr/bin/env' 'mount' '-t' 'zfs' '-n' 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now' '(null)'
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1161:zfsctl_snapshot_mount(): automount=error=0x100
1723819083   ffff8c8ab222b080 zfs_ctldir.c:1164:zfsctl_snapshot_mount(): Unable to automount /tmp/chroot/repo-002/test/.zfs/snapshot/now error=256

But also re-running the upcall did work, so:

# mount -t zfs -n 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now'
# ls '/tmp/chroot/repo-002/test/.zfs/snapshot/now'
0.sparse

?

And: funny moment, after leaving the chroot, mount says

repo-002 on /tmp/chroot/tmp/chroot/repo-002 type zfs (rw,relatime,xattr,noacl,casesensitive)
repo-002/test on /tmp/chroot/tmp/chroot/repo-002/test type zfs (rw,relatime,xattr,noacl,casesensitive)
repo-002/test@now on /tmp/chroot/tmp/chroot/repo-002/test/.zfs/snapshot/now type zfs (ro,relatime,xattr,noacl,casesensitive)
repo-002/test@now on /tmp/chroot/repo-002/test/.zfs/snapshot/now type zfs (ro,relatime,xattr,noacl,casesensitive)

Conversely, on my real system (2.1.11-1), the logged path starts with /sysroot (but naturally, my real root isn't below /sysroot anymore):

1723819321   zfs_ctldir.c:1105:zfsctl_snapshot_mount(): Unable to automount /sysroot/etc/.zfs/snapshot/2024-03-03 error=512

(and note the error being 0x200 here).
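
Editorial sketch: the error values in these logs (512, 256, 32000, 0xfffffffe) are easier to compare once decoded. The Python below assumes, without confirmation from the source code, that the value is what the kernel's usermode-helper call returned: either a wait(2)-style status with the helper's exit status in bits 8-15, or a negative errno when the helper could not be run at all. The function name is hypothetical.

```python
import errno


def decode_automount_error(error: int) -> str:
    """Describe an 'error=N' value from an automount log line.

    Assumption: N is a wait(2)-style status or a negative errno.
    """
    # Values printed as unsigned 32-bit hex (e.g. 0xfffffffe) are negative ints.
    if error >= 0x80000000:
        error -= 1 << 32
    if error < 0:
        name = errno.errorcode.get(-error, "unknown")
        return "helper could not be started: errno %d (%s)" % (-error, name)
    signal_number = error & 0x7F
    if signal_number:
        return "helper killed by signal %d" % signal_number
    return "helper exited with status %d" % ((error >> 8) & 0xFF)
```

Under that assumption, error=512 (0x200) is exit status 2, error=256 (0x100) is exit status 1, error=32000 is exit status 125, and 0xfffffffe is -2, i.e. ENOENT. What each exit status means is defined by mount(8) and the mount helper it invokes, and is not interpreted here.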


If I explicitly break the upcall I get the ELOOP we see elsewhere:

root@kasan-test-nokasan:/# ls /tmp/chroot/repo-002/test/.zfs/snapshot/now/
ls: cannot access '/tmp/chroot/repo-002/test/.zfs/snapshot/now/': Too many levels of symbolic links
root@kasan-test-nokasan:/# tail /proc/spl/kstat/zfs/dbgmsg
1723823595   ffffa06605998000 zfs_ctldir.c:1162:zfsctl_snapshot_mount(): automount='/tmp/env' 'mount' '-t' 'zfs' '-n' 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now' '(null)'
1723823595   ffffa06605998000 zfs_ctldir.c:1165:zfsctl_snapshot_mount(): automount=error=0xfffffffe
1723823595   ffffa06605998000 zfs_ctldir.c:1162:zfsctl_snapshot_mount(): automount='/tmp/env' 'mount' '-t' 'zfs' '-n' 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now' '(null)'

Then, fixing it, I see the silent failure:

root@kasan-test-nokasan:/# cp /bin/env /tmp/
root@kasan-test-nokasan:/# ls /tmp/chroot/repo-002/test/.zfs/snapshot/now/
root@kasan-test-nokasan:/# tail /proc/spl/kstat/zfs/dbgmsg
1723823695   ffffa06605998000 zfs_ctldir.c:1168:zfsctl_snapshot_mount(): Unable to automount /tmp/chroot/repo-002/test/.zfs/snapshot/now error=256
1723823695   ffffa06605998000 zfs_ctldir.c:1162:zfsctl_snapshot_mount(): automount='/tmp/env' 'mount' '-t' 'zfs' '-n' 'repo-002/test@now' '/tmp/chroot/repo-002/test/.zfs/snapshot/now' '(null)'
1723823695   ffffa06605998000 zfs_ctldir.c:1165:zfsctl_snapshot_mount(): automount=error=0x100

and by patching around to show the log, I see the error from the mount:

root@kasan-test-nokasan:/# cat /tmp/env
#!/bin/sh
exec > /dev/pts/0 2>&1
echo "$@"
exec "$@"
root@kasan-test-nokasan:/# ls /tmp/chroot/repo-002/test/.zfs/snapshot/now/
mount -t zfs -n repo-002/test@now /tmp/chroot/repo-002/test/.zfs/snapshot/now
mount: /tmp/chroot/repo-002/test/.zfs/snapshot/now: mount point does not exist.
       dmesg(1) may have more information after failed mount system call.
mount -t zfs -n repo-002/test@now /tmp/chroot/repo-002/test/.zfs/snapshot/now
mount: /tmp/chroot/repo-002/test/.zfs/snapshot/now: mount point does not exist.
       dmesg(1) may have more information after failed mount system call.

(and, again, running mount -t zfs -n repo-002/test@now /tmp/chroot/repo-002/test/.zfs/snapshot/now works)

This, to me, reads as "the upcall runs in the wrong namespace"? maybe?

This is also unrelated to the path being wrong, I think.

SpiffyB commented 1 month ago

Could this at all be related to the issue we're seeing here? https://github.com/openzfs/zfs/issues/14223

When snapdev=visible is set for the pool, snapshots are readable via /dev/zvol/..., but running zfs rename on a snapshot will not update the symlink in /dev/zvol/.

Interestingly, setting snapdev back to hidden will hide all snapshots except for the ones that were renamed.