Open concussious opened 1 week ago
As I said before, I don't like the idea of this kind of public comment, which describes nothing and leads to no resolution. But that is my personal position, and I'll let others decide.
You are much more knowledgeable than me, but imho this very clearly and conservatively documents a specific, widely reported, known cross-platform issue. The purpose of this type of doc is to keep the issue from becoming a footgun until it can be fixed.
I would be very grateful for any advice to improve what I'm trying to do without creating uncertainty.
I appreciate what you're trying to do, and I'm not against it in principle. As written though, I don't think this is accurate or helpful.
As far as I'm aware, there are no known issues with encrypted snapshots as such. If you snapshot an encrypted dataset, it works as expected: it can be cloned, rolled back to, read, and sent.
All "known" problems are around raw receive itself, or later uses of snapshots that were created via raw receive. I say "known" here because the things that we suspect still exist have been difficult or impossible to reproduce reliably enough in a lab environment where they can then be studied. Of the ones I know about (e.g. #12014), the difficulty is that the problem likely occurs when the stream is received, but isn't noticed until much later. So any reproducer is going to rely on a sequence of events.
Sometimes we get a user who can reproduce it reliably and is willing to help, which is a wonderful thing, but also means having to guide them through an often-complicated and always-dangerous debugging process (they usually have to crash their pool a lot, which is not kind to data). This work is extremely time consuming (== money) and rarely yields results.
The fact is, as best anyone can tell, encryption seems to work pretty well for most people most of the time, which is why I'm wary of an ambiguous warning against its use, in part or in full. Any remaining problems are only going to be solved with more eyeballs on the problem. If we're going to document anything, I would like it to be clear about where and what kinds of problems may arise, where we believe it's good, and call for help.
Thank you so much for the detailed explanation. I'm fairly new to bug triage and doc, and doing this because I love this amazing software, so I certainly don't want to create uncertainty and doubt. Is the new revision better?
… Is the new revision better?
From FreeBSD Discord (2024-11-12 19:47, before creation of the Bug 282622 thread):
I'm not convinced that zfs-load-key.8 is a suitable page to mention the issue(s).
More specifically, my thought at the time was similar to this part of what I later found at https://github.com/openzfs/openzfs-docs/issues/494#issue-2129666963 (2024-02-17), with added emphasis:
Should warnings be added to the sections of the documentation and/or the zfs command itself that mention native encryption that this combination of features (native encryption + send/recv) …
With a manual page mindset, the primary place that I'd expect a brief outline of the issue (openzfs/openzfs-docs 494) would be:
#BUGS (anchor to the not-yet-existent Bugs section of the page); or
For conciseness in this PR, I'll add some explanation to:
Typically, known issues are documented in release notes. That might make more sense than documentation.
There is a known issue reported on Linux and FreeBSD whereby snapshots experience runtime corruption when using ZFS native encryption. This filesystem is far too reliable for that to be a surprise until it can be fixed.
OpenZFS bug: #12014
FreeBSD bug: #282622
Signed-off-by: Alexander Ziaee ziaee@google.com
Co-authored-by: Lexi Winter lexi@le-fay.org
Cc @amotin @mmatuska, thanks.