I think this is a combination of a documentation shortfall and an opportunity to protect users by testing IVs when receiving.
On the documentation side, the Encryption section of the ZFS manpage should stress that critical encryption information is stored within the encryptionroot, and that destroying all copies of an encryptionroot will cause data loss.
On the protection side, perhaps zfs receive should warn or fail whenever an encryptionroot is discarded using the '-d' argument to zfs receive. Additionally, zfs receive could compare the IV of existing encryptionroots to the received encryptionroot and ensure they were/are the same if encryptionroot inheritance is being preserved.
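In the meantime, a raw stream can at least be inspected by hand before it is received. A minimal sketch, using a hypothetical dataset src/enc/data and the same zstreamdump filter that is suggested later in this thread:

```
# Dump the embedded key records of a raw send stream without receiving it,
# to see whether the stream carries its own wrapping-key information.
zfs send -w src/enc/data@snap | zstreamdump | sed -n -e '/crypt_keydata/,/end crypt/p; /END/q'
```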
the handling of -o encryption
is woefully inadequate and one of those things that was planned to be revisited, but Datto no longer seems to be working on this.
zfs receive should warn or fail whenever an encryptionroot is discarded
That would be an option that could have helped me assess the situation better before discarding the source pool. As a consequence, it ultimately appears to me that -d
is primarily an option for backing up datasets that will eventually be restored into their source again.
Additionally it now comes to my mind, we also need a mechanism to back up and restore encryption headers, IVs and keys, as we would do with any LUKS device as well. Wasn't this something the kind of binary @tcaputi brought up could have helped with?
Additionally it now comes to my mind, we also need a mechanism to back up and restore encryption headers, IVs and keys, as we would do with any LUKS device as well. Wasn't this something the kind of binary @tcaputi brought up could have helped with?
I would guess that a zfs send
of the encryptionroot dataset would include the headers and IVs necessary for any dataset that used that encryption root, but it would also include all of the data in that dataset. I always keep my encryptionroot datasets empty of data and unmounted, just like my pool root datasets, but not everyone does that.
Just started using ZoL with native encryption and think I have hit the same or a similar bug (related to #6624 as well).
truncate -s 100M /root/src.img
truncate -s 100M /root/replica.img
zpool create src /root/src.img
zpool create replica /root/replica.img
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt src/encrypted
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt replica/encrypted
zfs create src/encrypted/a
dd if=/dev/urandom of=/src/encrypted/a/test1.bin bs=1M count=1
zfs snap src/encrypted/a@test1
zfs send -Rvw src/encrypted/a@test1 | zfs receive -svF replica/encrypted/a
zfs mount -l replica/encrypted
zfs mount -l replica/encrypted/a
zfs change-key -i replica/encrypted/a
zfs umount -u replica/encrypted
zfs mount -l replica/encrypted
zfs mount replica/encrypted/a
All good at this point. Everything works as expected. Now, do an incremental send:
dd if=/dev/urandom of=/src/encrypted/a/test2.bin bs=1M count=1
zfs snap src/encrypted/a@test2
zfs send -RvwI @test1 src/encrypted/a@test2 | zfs receive -svF replica/encrypted/a
# ls -al /replica/encrypted/a/
total 2056
drwxr-xr-x 2 root root 4 Sep 26 03:59 .
drwxr-xr-x 3 root root 3 Sep 26 03:57 ..
-rw-r--r-- 1 root root 1048576 Sep 26 03:55 test1.bin
-rw-r--r-- 1 root root 1048576 Sep 26 03:59 test2.bin
Again, all good. Now unmount/mount:
zfs umount -u replica/encrypted
zfs mount -l replica/encrypted
# zfs get encryptionroot,keystatus -rt filesystem replica/encrypted
NAME PROPERTY VALUE SOURCE
replica/encrypted encryptionroot replica/encrypted -
replica/encrypted keystatus available -
replica/encrypted/a encryptionroot replica/encrypted -
replica/encrypted/a keystatus available -
# zfs mount -l replica/encrypted/a
cannot mount 'replica/encrypted/a': Permission denied
Yikes! This appears to have corrupted 10TB of backup filesystems. I've been trying to recover from this but no luck so far.
If I don't run change-key
then I can send incrementals, unmount, and mount no problem (I just have to enter the password in twice). If I run change-key
then unmount/mount still no problem. It's when I run change-key
and then send an incremental snapshot that seems to render the filesystem unmountable.
After running change-key
and sending an incremental, once the filesystem is unmounted it can't be mounted again. It looks like the encryption root absolutely has to be replicated to prevent this from happening. If I replicate the encryption root then everything works as expected.
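For comparison, here is a sketch of the variant that works, reusing the pool names from the reproducer above but with a made-up snapshot name and a not-yet-existing target dataset (replica/enc):

```
# Snapshot recursively at the encryption root and send it raw, so the
# wrapping-key information travels with the stream and the received
# top-level dataset becomes its own encryption root.
zfs snapshot -r src/encrypted@full1
zfs send -Rvw src/encrypted@full1 | zfs receive -sv replica/enc
```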
I may have also uncovered another bug in trying to recover from this. If I run zfs change-key -o keylocation=prompt -o keyformat=passphrase replica/encrypted/a
, after entering the new passwords the command hangs forever due to a panic. I have to completely reset the system.
[ 7080.228309] VERIFY3(0 == spa_keystore_dsl_key_hold_dd(dp->dp_spa, dd, FTAG, &dck)) failed (0 == 13)
[ 7080.228369] PANIC at dsl_crypt.c:1450:spa_keystore_change_key_sync_impl()
[ 7080.228399] Showing stack for process 1120
[ 7080.228403] CPU: 2 PID: 1120 Comm: txg_sync Tainted: P O 5.11.0-36-generic #40-Ubuntu
[ 7080.228406] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
[ 7080.228408] Call Trace:
[ 7080.228414] show_stack+0x52/0x58
[ 7080.228424] dump_stack+0x70/0x8b
[ 7080.228431] spl_dumpstack+0x29/0x2b [spl]
[ 7080.228448] spl_panic+0xd4/0xfc [spl]
[ 7080.228459] ? dsl_wrapping_key_rele.constprop.0+0x12/0x20 [zfs]
[ 7080.228597] ? spa_keystore_dsl_key_hold_dd+0x1a8/0x200 [zfs]
[ 7080.228687] spa_keystore_change_key_sync_impl+0x3c0/0x3d0 [zfs]
[ 7080.228776] ? zap_lookup+0x16/0x20 [zfs]
[ 7080.228899] spa_keystore_change_key_sync+0x157/0x3c0 [zfs]
[ 7080.228988] ? dmu_buf_rele+0xe/0x10 [zfs]
[ 7080.229064] ? dsl_dir_rele+0x30/0x40 [zfs]
[ 7080.229189] ? spa_keystore_change_key_check+0x178/0x4f0 [zfs]
[ 7080.229324] dsl_sync_task_sync+0xb5/0x100 [zfs]
[ 7080.229418] dsl_pool_sync+0x365/0x3f0 [zfs]
[ 7080.229507] spa_sync_iterate_to_convergence+0xe0/0x1e0 [zfs]
[ 7080.229609] spa_sync+0x305/0x5b0 [zfs]
[ 7080.229718] txg_sync_thread+0x26c/0x2f0 [zfs]
[ 7080.229835] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 7080.229952] thread_generic_wrapper+0x79/0x90 [spl]
[ 7080.229963] kthread+0x11f/0x140
[ 7080.229970] ? __thread_exit+0x20/0x20 [spl]
[ 7080.229980] ? set_kthread_struct+0x50/0x50
[ 7080.229984] ret_from_fork+0x22/0x30
I've tested this on (all x86_64):
I think you've got the problem mostly identified. While your zfs change-key -i
command exposed the problem, the root problem here was originally creating the receive encryption root manually, rather than receiving it from the source:
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt replica/encrypted
. Doing this means that your replica/encrypted
dataset has a different IV than your src/encrypted
dataset.
When you zfs change-key -i replica/encrypted/a
, that tells ZFS to inherit, which it happily does (without checking to see if the IVs match). I think ZFS happens to continue to allow you to use the FS until you send over another snapshot because the old IVs for those datasets are still associated with them somehow, and it just happens to keep working. If you rebooted, here, before doing the snapshot, it might stop working.
There's clearly a documentation shortfall, and some opportunities for code to protect users as well as some tooling to repair/import some of this data in disaster recovery situations. In your current situation, you may be able to recover by rolling back your receive dataset to eliminate all of the snapshots made after the change-key -i
, then re-sending your incrementals.
Doing this means that your replica/encrypted dataset has a different IV than your src/encrypted dataset.
So it sounds to me like it's OK to have child datasets with a different IV but the tooling should absolutely not be allowing zfs change-key -i
on these child datasets?
If you rebooted, here, before doing the snapshot, it might stop working.
After running zfs change-key -i replica/encrypted/a
I can still mount the filesystem after a reboot. It's only when I send the first incremental and then reboot (or simply unmount/mount) that I get cannot mount 'replica/encrypted/a': Permission denied
.
you may be able to recover by rolling back your receive dataset
Unfortunately this does not work. The encryptionroot
is still set to replica/encrypted
and I still get a permission denied on mount.
Another thing I noticed: while keys are loaded I tried doing zfs send -Rvw replica/encrypted/a@test1 | zfs receive -svF replica/a_restore
. But when I do a zfs mount -l replica/a_restore
and enter the password, I get Key load error: Incorrect key provided for 'replica/encrypted/a_restore'.
I tried this several times and get the same result: the password is no longer accepted.
Then I tried doing zfs change-key -o keylocation=prompt -o keyformat=passphrase replica/encrypted
. Earlier I had tried this on the child dataset and got a panic. Same deal:
[ 950.488754] VERIFY3(0 == spa_keystore_dsl_key_hold_dd(dp->dp_spa, dd, FTAG, &dck)) failed (0 == 13)
[ 950.488858] PANIC at dsl_crypt.c:1450:spa_keystore_change_key_sync_impl()
[ 950.488918] Showing stack for process 1131
[ 950.488924] CPU: 5 PID: 1131 Comm: txg_sync Tainted: P O 5.11.0-36-generic #40-Ubuntu
[ 950.488930] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
[ 950.488933] Call Trace:
[ 950.488942] show_stack+0x52/0x58
[ 950.488958] dump_stack+0x70/0x8b
[ 950.488970] spl_dumpstack+0x29/0x2b [spl]
[ 950.489001] spl_panic+0xd4/0xfc [spl]
[ 950.489024] ? dsl_wrapping_key_rele.constprop.0+0x12/0x20 [zfs]
[ 950.489341] ? spa_keystore_dsl_key_hold_dd+0x1a8/0x200 [zfs]
[ 950.489518] spa_keystore_change_key_sync_impl+0x3c0/0x3d0 [zfs]
[ 950.489703] spa_keystore_change_key_sync_impl+0x12c/0x3d0 [zfs]
[ 950.489878] ? zap_lookup+0x16/0x20 [zfs]
[ 950.490121] spa_keystore_change_key_sync+0x157/0x3c0 [zfs]
[ 950.490423] ? dmu_buf_rele+0xe/0x10 [zfs]
[ 950.490557] ? dsl_dir_rele+0x30/0x40 [zfs]
[ 950.490729] ? spa_keystore_change_key_check+0x178/0x4f0 [zfs]
[ 950.490886] dsl_sync_task_sync+0xb5/0x100 [zfs]
[ 950.491066] dsl_pool_sync+0x365/0x3f0 [zfs]
[ 950.491270] spa_sync_iterate_to_convergence+0xe0/0x1e0 [zfs]
[ 950.491473] spa_sync+0x305/0x5b0 [zfs]
[ 950.491686] txg_sync_thread+0x26c/0x2f0 [zfs]
[ 950.491927] ? txg_dispatch_callbacks+0x100/0x100 [zfs]
[ 950.492159] thread_generic_wrapper+0x79/0x90 [spl]
[ 950.492213] kthread+0x11f/0x140
[ 950.492224] ? __thread_exit+0x20/0x20 [spl]
[ 950.492242] ? set_kthread_struct+0x50/0x50
[ 950.492249] ret_from_fork+0x22/0x30
This happens even when I destroy the corrupted child dataset, so it appears that not only is this corrupting the replicated child datasets, it's also corrupting the parent dataset as well.
There's clearly a documentation shortfall
Yeah, this was shockingly easy to step into and pretty scary considering it appears to work just fine until that first reboot or unmount/mount!
So it sounds to me like it's OK to have child datasets with a different IV but the tooling should absolutely not be allowing zfs change-key -i on these child datasets?
I am not a ZFS expert, but I would guess that that is the case. I believe that change-key -i
requires both keys to be loaded, and probably even checks or changes the key for the child dataset as needed, but neglects to do the same for the IVs.
@almereyda What you are observing should've been fixed by #9309, which was merged two years ago. Unfortunately it was never included in a 0.8.x release; this looks like an oversight to me. It is included in the 2.0.0 release though. And indeed I can't reproduce this on a somewhat current master.
Regarding your inaccessible data, let's summarize key handling for a better understanding of what's happening here.
Every dataset has its master key which is shared with all snapshots and clones of that dataset. This is a random key generated on dataset creation and never changes. It is stored encrypted on disk. The encryption keys which are used to encrypt the data are derived from that master key. The on disk master key is encrypted with a wrapping key which in turn is either passed in verbatim
via keyformat=raw|hex
or generated by PBKDF2 which uses a passphrase and a salt as inputs (keyformat=passphrase
). The encryptionroot
property refers to the dataset where the wrapping key information is stored.
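For reference, the pieces described above can be inspected through dataset properties; a quick look, assuming a hypothetical pool named pool1:

```
# encryptionroot shows where the wrapping key information lives,
# keystatus shows whether that wrapping key is currently loaded.
zfs get -r encryption,keyformat,keylocation,encryptionroot,keystatus pool1
```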
You are now in a situation where you can't decrypt the master key of the pool1/test
dataset since the wrapping key the encryption root
refers to is different from the one used to encrypt the master key. The PBKDF2 salt changed when you recreated the pool1
dataset, meaning the same passphrase will now generate a different wrapping key. Whether or not your data is recoverable depends on the accessibility of the old PBKDF2 salt. It may still be stored in the pool1/test
dataset. If this is true a custom binary could make the pool1/test
dataset its own encryption root and the data would be accessible again. This is basically what zfs change-key
does if you give it the right -o options and run it on a non-encryptionroot dataset. But it requires the keys to be loaded since it has to re-encrypt the master key. So this custom binary needs to be written or the zfs
command extended to allow for that. I'm not aware of any other options to recover the data.
Please note that there's a bit of speculation in the above since I have no zfs 0.8.x lying around to reproduce with. If you're desperate enough I can give you some commands to check my assumptions.
@AttilaFueloep That's an interesting backstory. However, nowhere in @almereyda's listed commands is -d
included. The only listed arguments to a zfs receive
are -svF
. Can this still happen without the -d
?
Well, this is in the reproducer.
zfs send -Rw pool1/test@snap | zfs recv -d pool2
zpool destroy pool1
cfdisk /dev/sda # to downsize the partition
zpool create ... pool1 /dev/sda3
zfs send -Rw pool2/test@snap | zfs recv -d pool1
recv -d
is even mentioned in the subject of this issue.
I think you are referring to the issue @brenc reported here. That one is different since it involves changing the encryptionroot
via zfs change-key -i
. I haven't found time to look into that yet.
Ahh yes. We may have two different problems that ended up in the same issue. They could be related, or they could not be. Perhaps we can spin off the @brenc problem into a new issue, and close @AttilaFueloep's as resolved in 2.0?
Thank you for the detailed explanation @AttilaFueloep! Especially the details on the relation between the salts and the IVs as wrapping keys for the key helped me a lot in understanding the mechanics involved here.
The original encryption root is not available anymore, but the mirrors of its descendants are. I would like to test these commands to check your assumptions, @AttilaFueloep. If we were to fabricate said binary, or an extension to the zfs
command, not to speak of your earlier suggestions, @secabeen:
On the documentation side, the Encryption section of the ZFS manpage should stress that critical encryption information is stored within the encryptionroot, and that destroying all copies of an encryptionroot will cause data loss.
On the protection side, perhaps zfs receive should warn or fail whenever an encryptionroot is discarded using the '-d' argument to zfs receive. Additionally, zfs receive could compare the IV of existing encryptionroots to the received encryptionroot and ensure they were/are the same if encryptionroot inheritance is being preserved.
that would be even better, and we might all benefit from it at some point. I'm not able to tell whether #9309 already addresses any of the technicalities involved here.
Just to confirm that the issue persists (with a little help from https://github.com/openzfs/zfs/issues/4553#issuecomment-632068563) on a dangling copy of the locked-up datasets (even though the key is present and known):
$ zfs get keystatus pool1/test/test
NAME PROPERTY VALUE SOURCE
pool1/test/test keystatus available -
$ mount -t zfs -o zfsutil pool1/test/test /mnt
filesystem 'pool1/test/test' can not be mounted: Permission denied
As a takeaway from @secabeen's other comment
I always keep my encryptionroot datasets empty of data and unmounted, just like my pool root datasets, but not everyone does that.
my strategy will now certainly be to keep a zpool's native dataset unmountable and unencrypted, and to have its capital-letter descendants act as unmountable encryption roots from now on. This also opens up the possibility to use different keys for each, which is nice.
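A minimal sketch of that layout, using a throwaway file-backed pool in the style of the reproducer above (hypothetical names; each capital-letter child is its own encryption root with its own passphrase):

```
truncate -s 100M /root/tank.img
zpool create -O canmount=off tank /root/tank.img   # pool root stays unencrypted and unmounted
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt \
    -o canmount=off tank/USERDATA                  # first encryption root, never mounted
zfs create -o encryption=on -o keyformat=passphrase -o keylocation=prompt \
    -o canmount=off tank/BACKUP                    # second encryption root, separate key
zfs create tank/USERDATA/home                      # inherits encryption from tank/USERDATA
```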
A strategy for recursively and incrementally replicating (un)encrypted datasets across machines, either (1) into new encryptionroots or (2) by raw-mirroring the original encryption hierarchy, where the encryption keys are not loaded in the same place but eventually at the same time, is left open for me to discover. Previously non-raw send/receive worked well with encrypted datasets between pools on the same machine, and we are new to doing so between different machines. I guess here we would be dealing with accommodations for the cases of:
1. keystatus: available, i.e. encrypted > (decrypted) > reencrypted below a new encryptionroot
2. keystatus: unavailable, i.e. encrypted > (raw) > below
To cite the current documentation on https://openzfs.github.io/openzfs-docs/man/8/zfs-recv.8.html (emphasis by me):
If the -d option is specified, all but the first element of the sent snapshot's file system path (usually the pool name) is used and any required intermediate file systems within the specified one are created.
Now that's a bummer if the dataset with the name of the pool is incidentally also a required intermediate file system, acting as an encryptionroot
, for example.
If we do not aspire to extend the (online) documentation as suggested above, or to seek an implementation to extract and implant IV salts, it is fine with me to close the discussion as won't fix and to fork off the related case into its own issue.
Else, would there be any non-destructive way for me to check whether the IVs are still present in the datasets, and if or how to use change-key
for making a dataset its own encryptionroot, in case it previously inherited one?
Especially the details on the relation between the salts and the IVs as wrapping keys
A minor clarification. In this context an IV (Initialization Vector, sometimes also called a nonce, a number used once) is a block of random data which is used to make sure that the same key and plaintext always produce a different ciphertext. It is generated randomly for each block to encrypt. To decrypt the block you need both the key and the IV. The IV used to encrypt the master key is stored on disk alongside the encrypted master key. The wrapping key is generated from the passphrase. A salt is a similar concept, but it applies to the passphrase: again you need both the passphrase and the salt to generate the wrapping key.
I would like to test these commands to check your assumptions,
Never mind, I could dig out an old enough zfs installation and reproduce the issue. Unfortunately I don't see a way to recover the lost salt, so I'm afraid your data isn't recoverable. By destroying the originating pool you lost the wrapping key since you lost the salt.
I'm not able to tell if #9309 already addresses any of the technicalities involved here.
Yes, they are addressed. When receiving a raw zstream with a missing encryption root, the topmost received dataset is made the new encryption root of itself and the datasets below it. #9309 just fixed a bug which broke this process in the case of recv -d.
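As an illustration of that behaviour, a sketch with hypothetical pool and dataset names (not the ones from this thread):

```
# Send only the child raw, dropping its encryption root pool1/enc,
# assuming @snap exists recursively below pool1/enc/data.
zfs send -Rw pool1/enc/data@snap | zfs recv -d pool2
# With the fix, the topmost received dataset becomes its own encryption root:
zfs get encryptionroot pool2/enc/data   # expected: pool2/enc/data
```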
Else, would there be any non-destructive way for me to check if the IVs are still present in the datasets, and if or how to use
change-key
for making a dataset its own encryptionroot, in case before that was inherited?
Although there is currently no way to restore key information, you can dump it in plain text by doing
zfs snapshot data/set@snap
zfs send -w data/set@snap | zstreamdump | sed -n -e '/crypt_keydata/,/end crypt/p; /END/q'
To make an encrypted dataset the encryption root of itself and all descendant datasets do the following
zfs change-key -o keyformat=passphrase -o keylocation=prompt data/set
There is no need for keyformat and keylocation to differ from those of the current encryption root, but the keys must be loaded.
it's a pity the OpenZFS 2.x release line hadn't been backported to Ubuntu 20.04 LTS
You could ask Canonical to include #9309 in their next zfs update. It should apply cleanly to the 0.8 tree. As far as I know they do backport certain ZFS changes to their (LTS) distro tree.
Ahh yes. We may have two different problems that ended up in the same issue. They could be related, or they could not be. Perhaps we can spin off the @brenc problem into a new issue, and close @AttilaFueloep's as resolved in 2.0?
Although this isn't my issue I'd say that's the way to go.
@brenc Would you mind opening a new issue and just copy over what you wrote in this one? I think the original issue is fixed by #9309 and yours is different.
@brenc I can reproduce your issue on a somewhat current master. In short the problem is that the incremental receive overwrites the master key of the replica/encrypted/a
dataset with the received key, which is encrypted with the wrapping key from src/encrypted
. Since the unencrypted master key is cached in memory this goes unnoticed until it is unloaded by the unmount. A subsequent mount tries to decrypt the master key with the wrapping key from replica/encrypted
which obviously fails.
Please see https://github.com/openzfs/zfs/issues/12000#issuecomment-933006046 for the terminology used above.
Off the top of my head I can't come up with a simple fix; I'll have to think more about it. To recover the replica/encrypted/a
dataset one would need a custom binary which decrypts the master key with the wrapping key from src/encrypted
and reencrypts it with the replica/encrypted
wrapping key.
This definitely is a different problem and deserves its own issue.
@brenc Would you mind opening a new issue and just copy over what you wrote in this one? I think the original issue is fixed by #9309 and yours is different.
Done #12614
Off the top of my head I can't come up with a simple fix
Luckily for me I was able to just start over and replicate the entire dataset from the encryption root up. Hopefully there aren't too many people out there doing what I did who haven't rebooted / remounted in months...
Thanks for the detailed info. I appreciate knowing more about how this stuff works.
Thanks, I'll continue over there.
Hopefully there aren't too many people out there doing what I did who haven't rebooted / remounted in months...
Fully agree, sadly this is way too easy to trigger and can go undetected for a long time.
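Since the breakage only shows up once the cached keys are gone, one way to catch it early is to periodically unload and reload the keys on the backup and try mounting again; a small check along the lines of the reproducer earlier in this thread:

```
# Unmount, drop the cached keys, reload them and mount again; if this
# fails with "Permission denied", the backup's encryption hierarchy is broken.
zfs unmount replica/encrypted/a
zfs unmount replica/encrypted
zfs unload-key -r replica/encrypted
zfs load-key -r replica/encrypted
zfs mount replica/encrypted && zfs mount replica/encrypted/a
```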
I am not sure if this is related.
But I cannot unlock one of my child datasets.
sudo zfs load-key secure2
Key load error: Key already loaded for 'secure2'.
sudo zfs load-key secure2/nextcloud2
Key load error: Keys must be loaded for encryption root of 'secure2/nextcloud2' (secure2).
I also cannot create or delete files in the root of my parent dataset. In the working dataset and in subfolders I have no issue:
root@silverrock[/mnt/secure2]# ls -la
total 35
drwxr-xr-x 5 root root 5 Sep 26 16:01 .
drwxr-xr-x 6 root root 6 Oct 10 06:04 ..
drwxr-xr-x 6 root root 6 Oct 10 13:47 charts
drwxr-xr-x 2 root root 2 Sep 26 16:01 ix-applications
drwxrwx--- 3 root www-data 4 Oct 10 13:53 nextcloud2
root@silverrock[/mnt/secure2]# touch test
touch: setting times of 'test': No such file or directory
root@silverrock[/mnt/secure2]# rm -rf nextcloud2
rm: cannot remove 'nextcloud2': Operation not permitted
Would you like to open a new follow-up issue, so we can triage your case independently of the specifics detailed here? Then we can more cleanly find suitable answers to your questions.
Anyway I'm trying to find some initial answers from what you provided:
It is entirely expected that you cannot remove the ZFS dataset with rm while it is mounted at that path.
Can you show us the outputs of the following commands?
df -h /mnt/secure2
df -h /mnt/secure2/nextcloud2
They will show us which dataset/volume is currently mounted there.
Also some details about your dataset could help in debugging the issue.
zfs get all /mnt/secure2
zfs get all /mnt/secure2/nextcloud2
Then I can only think of the case that you might have conducted a raw send and receive of the nextcloud2 dataset while creating another secure2 dataset that uses a different IV than its origin.
I suggest we discuss everything else in your follow-up issue.
@almereyda I fixed the issue for my nextcloud2 dataset, but every new dataset has the same issue. I think the issue may be related to the fact that I cannot write to or remove any files and directories in the root of my parent dataset.
root@silverrock[~]# df -h /mnt/secure2
Filesystem Size Used Avail Use% Mounted on
secure2 758G 384K 758G 1% /mnt/secure2
root@silverrock[~]# df -h /mnt/secure2/nextcloud2
Filesystem Size Used Avail Use% Mounted on
secure2/nextcloud2 914G 157G 758G 18% /mnt/secure2/nextcloud2
So my issue is currently with my parent dataset and how to repair it, so that new datasets are created without issues and I can write files into it. I can create a separate issue for this.
I have made a related issue: https://github.com/openzfs/zfs/issues/14011
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
As the original post was solved with #9309, we can close here and open questions continue in
Is this really fixed? I encountered this with OpenZFS v2.1
As of https://github.com/openzfs/zfs/commit/bb61cc31851098ab41a7dcb56333a90b40d65129, the patch that was communicated to solve this behaviour has been available since ZFS v2.x.x.
Could you provide a small writeup on how you reproduced this error in your environment? I'm happy to reopen here, if we find more evidence that makes the case.
Yeah, I actually hit this again. Sender is an x86_64 system running zfs v2.2.99-365_g8f2f6cd2a and kernel v6.1.81 while receiver is an arm64 (rpi4) system running zfs v2.2.3 and kernel v6.6.12 (arch linux arm).
Reproducer:
1. Initial raw send: sudo zfs send -w <set>@<snapshot0> | ssh <receiver> "doas zfs receive -uv <nonexistent_set>"
2. Incremental raw send: sudo zfs send -w -v -I <set>@<snapshot0> <set>@<snapshot1> | ssh <receiver> "doas zfs receive -uv <newly_received_set>"
I guess I'd need to send the updates as unencrypted streams somehow. Any hints?
System information
Describe the problem you're observing
Following up from #6624 and #9273, it happens that a certain Long-Term Support release from Ubuntu that natively ships with ZFS 0.8.1 (meanwhile updated to 0.8.3) can still produce datasets that have a broken encryption hierarchy, even though everything appears to be well at first.
In https://github.com/openzfs/zfs/issues/6624#issuecomment-328762569 @tcaputi speaks of:
Is it possible to apply this workaround in a way to restore the ability to mount the datasets? For this case, let's consider the source is not available anymore for retransmission. This is due to a juggling of datasets for downsizing a pool in a mirrored setup, where the source has been replaced by a smaller version of itself.
Two things are odd here:
encryptionroot
is a read-only value. How is it possible to re-set it?
Describe how to reproduce the problem
Messages from the logs
Additional information
This is also discussed in:
Interestingly running
zfs load-key -r pool1
will not try to load the keys for the datasets for which keys are already available (via their encryption root pool1
), yet they still remain unmountable. Also, some of the descendant encryption roots that Docker apparently created are decryptable, but others aren't. This gives hope, next to the key not being rejected for the other datasets, that all USERDATA is still available and recoverable.
Some datasets decrypt, others don't, especially from Docker.
```
$ zpool export pool1
$ zpool import -l pool1
Enter passphrase for 'pool1':
Enter passphrase for 'pool1/DOCKER/lib/697917a208c21f64616d09db164318acd0f20e4d1fffa7e4e66f8de95fa47bb2':
Enter passphrase for 'pool1/DOCKER/lib/79944167f45fa8346086f0ff167a2d423bdb4f7d9ded72e9c401f64a15e49ac8':
Enter passphrase for 'pool1/DOCKER/lib/33a48ff62d295e54f5d83675ad75b833f75713a7c3d7b56a89c69b08aceb9d0f':
Enter passphrase for 'pool1/DOCKER/lib/1fb15fd5d836f0dbbe60645d532f4a13225704f790d64510e5785e841a6a9dfb':
Enter passphrase for 'pool1/DOCKER/lib/f9bc3e5084268da84adc5a2e17e5f894dd1b3d7e5c8ac9bc7f947ce30c4c8a40':
Enter passphrase for 'pool1/DOCKER/lib/130b413032e79c2a252a83dc7226f3f4106536074c2cc7c2729dc85e2959712a':
Enter passphrase for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56'.
Enter passphrase for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56'.
Enter passphrase for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/eae4f3bd5bea07d25f88c7fc2ffd6324b595ea067162e6ed67d0c5f16097dd56'.
Enter passphrase for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b'.
Enter passphrase for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b'.
Enter passphrase for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/683b04ac797d220ea673d705db7246e9254cb24f5f14171f251b3e45012cb93b'.
Enter passphrase for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2'.
Enter passphrase for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2'.
Enter passphrase for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/291c9a86a0bfdcd8546cff5b6edf7805f6cd548f04ae799ffbea333faf18c8b2'.
Enter passphrase for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368'.
Enter passphrase for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368'.
Enter passphrase for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/968ecf6144a9ac309982de243e23e0bc782f77fd68941248b5f3c827463e5368'.
Enter passphrase for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb'.
Enter passphrase for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb'.
Enter passphrase for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/dc45835eb73741e53f87fc316342fae82a41f2c7421714ebefc6a2d5b04cebbb'.
Enter passphrase for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc'.
Enter passphrase for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc'.
Enter passphrase for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/3071a1943c5d067abbfea06ff3721dd6aac6ff56c25ad83c9a65281a115326bc'.
Enter passphrase for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9'.
Enter passphrase for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9'.
Enter passphrase for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9':
Key load error: Incorrect key provided for 'pool1/DOCKER/lib/52bfa76983a9904e7e3cfbc911c1c2e0e92d69ee26467e46f002a8de614dc6d9'.
7 / 14 keys successfully loaded
filesystem 'pool1/DOCKER/lib' can not be mounted: Permission denied
cannot mount 'pool1/DOCKER/lib': Invalid argument
filesystem 'pool1/USERDATA/bases' can not be mounted: Permission denied
cannot mount 'pool1/USERDATA/bases': Invalid argument
filesystem 'pool1/USERDATA/lake' can not be mounted: Permission denied
cannot mount 'pool1/USERDATA/lake': Invalid argument
filesystem 'pool1/BACKUP/bases' can not be mounted: Permission denied
cannot mount 'pool1/BACKUP/bases': Invalid argument
filesystem 'pool1/ROOT/system' can not be mounted: Permission denied
cannot mount 'pool1/ROOT/system': Invalid argument
filesystem 'pool1/ROOT/shell' can not be mounted: Permission denied
cannot mount 'pool1/ROOT/shell': Invalid argument
```
Further mentions of similar behaviours are:
-d
-F
and also replicates encryption roots distinctively
change-key -i
It is not possible to set a new keylocation for dependent datasets in the encryption hierarchy.
Following the suggestion from https://github.com/openzfs/zfs/issues/6847#issuecomment-342864902 also doesn't help in this case, where the encryption root has been replaced:
The pool itself does not report any data errors:
Might it be possible to reintroduce a new encryption root for those datasets that currently don't mount, possibly by decomposing the pool into two and reconstructing it again, or are there any other known workarounds I am not aware of yet?
Thank you for your kind help, our users will appreciate it.
Reflection and conclusion
Reading the linked references brings up their reasoning and questioning again:
Which other good practices are known to avoid the side effects and edge cases we produced here, other than creating independently encrypted datasets under the root dataset of the pool that act as encryption roots for their descendants, and leaving the root dataset itself unencrypted?
When trying to do
send | recv -x encryption
, the command output complains about a missing raw flag, even though we have migrated datasets from unencrypted pools into encrypted ones before. Now this looks even more strange: is the source key actually available and loaded, or is it not? From the given output of the commands, I can no longer tell.
Reading through zfs-receive.8 makes me guess that we now probably have different initialization vectors (IV) for the AEAD cipher in the
pool1
dataset and its descendants, which is why recreating the encrypted zpool with the same passphrase does not mean it can act as an encryption root for its newly retrieved children.