openzfs / openzfs-docs

OpenZFS Documentation
https://openzfs.github.io/openzfs-docs/

reconsider pool-level native encryption #354

Open anarcat opened 2 years ago

anarcat commented 2 years ago

The instructions provided, at least on the Debian side of the world, include something like this:

zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -O encryption=on -O keylocation=prompt -O keyformat=passphrase \
    -O acltype=posixacl -O xattr=sa -O dnodesize=auto \
    -O compression=lz4 \
    -O normalization=formD \
    -O relatime=on \
    -O canmount=off -O mountpoint=/ -R /mnt \
    rpool ${DISK}-part4

From what I understand, the above creates a pool named rpool and its associated rpool dataset, with native encryption. This is pretty nice, but it has significant caveats in my experience. Most notably, it makes it hard (or impossible?) to send/receive with --replicate (-R). This was a surprise to me: the guide nowhere mentions this kind of limitation, and it's a rather big one because many things don't work as expected.

For example, after installing a server following the procedure, I wanted to move between HDD and (smaller) SSD disks. Normally, I should have been able to just create a new pool, and zfs send/receive the data between the two. But that doesn't work:

# zfs snapshot -r rpool@shrink &&
    zfs send -vR rpool@shrink | zfs receive -vFd rpoolssd
cannot send rpool@shrink: encrypted dataset rpool may not be sent with properties without the raw flag

Now yes, I could pass the --raw flag here, but it doesn't work as one would expect: what that would do would be to recreate encrypted datasets under the encrypted pool, essentially double-encrypting the datasets. The resulting data, from what I can tell, is not something I can readily open. Maybe there's a way to stream it back to the original pool, but I haven't actually figured out how.

I ended up doing something like this:

for ds in $(zfs list -H -o name | grep '^rpool/'); do zfs send "$ds@shrink" | zfs receive -vd rpoolssd; done

... but that feels rather clunky and weird. I think setting up encryption at the dataset (as opposed to "root dataset") level might be more appropriate, and maybe this is something the guide should suggest as well.
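
A minimal sketch of the same per-dataset workaround, written as a dry run so the commands can be reviewed before anything is sent. The function name plan_child_sends is hypothetical; the pool and snapshot names are the ones used above, and the dataset list is assumed to come from `zfs list -H -o name -r rpool`.

```shell
# Print one send/receive pipeline per child dataset (dry run, nothing is executed).
# Arguments: source pool, snapshot name, destination pool.
# Stdin: dataset names, one per line, as printed by `zfs list -H -o name -r POOL`.
plan_child_sends() {
    src=$1
    snap=$2
    dst=$3
    while IFS= read -r ds; do
        case $ds in
            # Only children of the source pool; the pool's root dataset is skipped.
            "$src"/*)
                printf 'zfs send %s@%s | zfs receive -vd %s\n' "$ds" "$snap" "$dst"
                ;;
        esac
    done
}
```

For example, `zfs list -H -o name -r rpool | plan_child_sends rpool shrink rpoolssd` prints one pipeline per child dataset; the output can be inspected and then run by hand.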

At the very least, the guide should certainly discuss the limitations of ZFS native encryption. Those are also discussed in zfs-load-key but that's kind of buried out there.

The FAQ doesn't mention anything in that regard either.

Would you be open to a PR that would start moving encryption downwards in the datasets?

thanks

anarcat commented 2 years ago

I have also just noticed the dataset layout is slightly inconsistent across OSes... in Debian, you have rpool/var, rpool/home, etc., but in Fedora, you have rpool/redhat/var, for example. Arch is like Fedora, but Ubuntu is like Debian...

ghost commented 2 years ago

They are consistent across distros. The "redhat", "fedora", "arch" are used to identify which kind of distro they are, and encryption is enabled on those distro datasets instead of the entire pool.

It also supports installing multiple distros on the same pool, you would have rpool/debian rpool/devuan rpool/nixos rpool/redhat, for example.

I discussed the limitations of encroot=rpool here: https://github.com/openzfs/openzfs-docs/pull/127#issuecomment-785491091

anarcat commented 2 years ago

> They are consistent across distros. The "redhat", "fedora", "arch" are used to identify which kind of distro they are, and encryption is enabled on those distro datasets instead of the entire pool.

I think you're incorrect. In the Debian and Ubuntu guides, it's not like this: the entire rpool is encrypted.

> I discussed the limitations of encroot=rpool here. #127 (comment)

thanks, that's interesting.

ghost commented 2 years ago

I'm the maintainer of arch linux, nixos, fedora, rhel and alpine linux guides. They are consistent in dataset encryption as I mentioned above.

debian, ubuntu are maintained by rlaager and opensuse seems orphaned at the moment.

anarcat commented 2 years ago

On 2022-10-17 12:02:20, ne9z wrote:

> I'm the maintainer of arch linux, nixos, fedora, rhel and alpine linux guides. They are consistent in dataset encryption as I mentioned above.

Right. I didn't mean to imply those were inconsistent internally. Sorry if it came out that way.

> debian, ubuntu are maintained by rlaager and opensuse seems orphaned at the moment.

... and those are the ones I'm suggesting we should reconsider. :)

rlaager commented 2 years ago

Let's keep this issue focused on the merits of encrypting the whole pool. If you want to discuss cross-distro layout consistency, file a separate issue. (But also keep in mind that the Ubuntu HOWTOs already need to change to drop all the zsys stuff, which will bring Ubuntu and Debian in sync again.)

rlaager commented 2 years ago

I tested this with encryption at the root and (only) on a child. I get exactly the same result:

# zfs send -vR separateold/child@shrink | zfs receive -vFd separatenew
cannot send separateold/child@shrink: encrypted dataset separateold/child may not be sent with properties without the raw flag
cannot receive: failed to read from stream

So I'm not seeing how doing the encryption one level down provides any benefit on this particular concern.

I'm certainly no fan of the fact that -R is effectively useless with encryption. But that's not the fault of the root-on-ZFS HOWTOs.

I scripted around that limitation for my replication. My script reads the properties from the source and sets them on the destination, like this:

zfs send -c -L -n -v old@shrink | zfs recv -F -v -o mountpoint=/old -o compression=lz4 -o canmount=off -o xattr=sa -o dnodesize=auto -o acltype=posixacl -o relatime=on new

(That mountpoint=/old is an artifact of this being a test pool on a running system.)
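
The property-copying step could be sketched roughly as follows. This is not the script mentioned above, only an illustration under assumptions: the function name props_to_recv_opts is made up, the input is assumed to be the tab-separated output of `zfs get -H -o property,value -s local all old`, and the encryption-related properties are skipped on the assumption that the destination's encryption settings should be left alone.

```shell
# Turn "property<TAB>value" lines into `-o property=value` arguments
# suitable for passing to zfs recv.
props_to_recv_opts() {
    opts=
    tab=$(printf '\t')
    while IFS=$tab read -r prop value; do
        case $prop in
            # Assumption: never copy key/encryption settings to the destination.
            ''|encryption|keyformat|keylocation|pbkdf2iters) continue ;;
        esac
        # Append, separating entries with a single space.
        opts="${opts:+$opts }-o $prop=$value"
    done
    printf '%s\n' "$opts"
}
```

Something like `zfs get -H -o property,value -s local all old | props_to_recv_opts` would then print the `-o` arguments to append to the zfs recv command. Values containing spaces would need extra quoting, which this sketch does not handle.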

anarcat commented 2 years ago

On 2022-10-17 13:47:35, Richard Laager wrote:

> Let's keep this issue focused on the merits of encrypting the whole pool.

Sure.

For what it's worth, I don't have an opinion for or against, I just found myself in a bind, in a situation where the whole pool encryption is hurting me pretty bad...

> If you want to discuss cross-distro layout consistency, file a separate issue. (But also keep in mind that the Ubuntu HOWTOs already need to change to drop all the zsys stuff, which will bring Ubuntu and Debian in sync again.)

well, i filed this issue about that inconsistency i guess. ;)

rlaager commented 2 years ago

> I just found myself in a bind, in a situation where the whole pool encryption is hurting me pretty bad...

I still don't understand how having the encryption on "rpool/debian" vs "rpool" would make a difference. zfs send -R still wouldn't work. Can you explain more about how you think that a separate dataset would help? Is --raw somehow involved in the proposed solution?

anarcat commented 2 years ago

On 2022-10-17 23:27:16, Richard Laager wrote:

> > I just found myself in a bind, in a situation where the whole pool encryption is hurting me pretty bad...

> I still don't understand how having the encryption on "rpool/debian" vs "rpool" would make a difference. zfs send -R still wouldn't work. Can you explain more about how you think that a separate dataset would help? Is --raw somehow involved in the proposed solution?

From what I understand, the problem is not so much with -R as with -F on the receive side, as the receiver is not allowed to delete the root dataset.

Does that make sense?

rlaager commented 2 years ago

You make a good point about zfs recv -F and the root dataset.

I am able to reproduce such a failure with:

# zfs send -vR --raw old@shrink | zfs receive -vFd new
full send of old@shrink estimated size is 69.6K
full send of old/srv@shrink estimated size is 70.1K
total estimated size is 140K
TIME        SENT   SNAPSHOT old@shrink
cannot receive new filesystem stream: zfs receive -F cannot be used to destroy an encrypted filesystem or overwrite an unencrypted one with an encrypted one
TIME        SENT   SNAPSHOT old/srv@shrink

But, there's no real data in the root dataset.

I did this:

# zfs send -vR --raw old/srv@shrink | zfs receive -vFd new
full send of old/srv@shrink estimated size is 70.1K
total estimated size is 70.1K
TIME        SENT   SNAPSHOT old/srv@shrink
receiving full stream of old/srv@shrink into new/srv@shrink
received 25.6K stream in 1 seconds (25.6K/sec)
filesystem 'new/srv' can not be mounted: Permission denied
cannot mount 'new/srv': Invalid argument
# zfs send -vR --raw old/ROOT@shrink | zfs receive -vFd new
full send of old/ROOT@shrink estimated size is 69.6K
full send of old/ROOT/ubuntu@shrink estimated size is 69.6K
total estimated size is 139K
TIME        SENT   SNAPSHOT old/ROOT@shrink
receiving full stream of old/ROOT@shrink into new/ROOT@shrink
TIME        SENT   SNAPSHOT old/ROOT/ubuntu@shrink
received 23.1K stream in 1 seconds (23.1K/sec)
receiving full stream of old/ROOT/ubuntu@shrink into new/ROOT/ubuntu@shrink
received 23.1K stream in 1 seconds (23.1K/sec)
filesystem 'new/ROOT/ubuntu' can not be mounted: Permission denied
cannot mount 'new/ROOT/ubuntu': Invalid argument

That gets the datasets over. But they're failing to mount???

I exported and re-imported the pool. They're still failing. They show an encryptionroot of "new", but I can't get it to load the key:

# zfs mount new/srv
filesystem 'new/srv' can not be mounted: Permission denied
cannot mount 'new/srv': Invalid argument
# zfs load-key new/srv
Key load error: Keys must be loaded for encryption root of 'new/srv' (new).
# zfs load-key new
Key load error: Key already loaded for 'new'.
# zfs unload-key new
# zfs load-key new/srv
Key load error: Keys must be loaded for encryption root of 'new/srv' (new).

It feels like the raw send got the data over, but the encryptionroot is messed up in some way! So...that's really bad! It should have worked, with an encryptionroot of either "new/srv" (what it sounds like you expected, and what I think I expected), or "new"; being broken is definitely not okay.

So I definitely have to say that the current situation is bad overall. If the encryptionroot was not the top of the pool, that would definitely be safer.
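
To see which datasets on an existing system have the top of the pool as their encryption root, one option is to compare each dataset's encryptionroot with its pool name. A small sketch, assuming input in the tab-separated form printed by `zfs get -r -H -o name,value -t filesystem,volume encryptionroot new`; the function name is hypothetical.

```shell
# Read "dataset<TAB>encryptionroot" lines and print the datasets whose
# encryption root is the pool's top-level dataset.
list_pool_level_encroots() {
    tab=$(printf '\t')
    while IFS=$tab read -r name root; do
        # The pool name is everything before the first slash.
        pool=${name%%/*}
        if [ "$root" = "$pool" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

Unencrypted datasets report `-` as their encryptionroot and are therefore never listed.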