Open davepacheco opened 1 year ago
If ZFS user properties are not sufficient, we could presumably also store a file in the root of the dataset itself. It would also be worth assessing how properties interact with encryption: are they encrypted, or are they always visible in plaintext even before the keys have been provided to unlock the dataset?
During #1880 we ran into a case where the Sled Agent discovered ZFS datasets with the "oxide:uuid" property set and attempted to create new ClickHouse and CockroachDB zones using them. This is expected behavior when the system has restarted (and, for now, when Sled Agent restarts). The problem in this case is that Sled Agent also needs metadata that's stored in /var/oxide, and that metadata was missing.
Relevant log entries:
As I understand it:
/var/oxide/$pool_uuid/$dataset_uuid
We identified at least two ways we can wind up with a dataset with nothing in /var/oxide for it:
This should not come up in production, but if it ever does, presumably we want to flag this as an issue requiring Oxide support and otherwise ignore the dataset.
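A minimal sketch of the "flag and ignore" behavior, assuming the metadata for a dataset lives at `/var/oxide/$pool_uuid/$dataset_uuid` as described above. The function names, the logger name, and the choice to treat mere existence of that path as "metadata present" are all hypothetical; this is not Sled Agent's actual code.

```python
import logging
from pathlib import Path

log = logging.getLogger("sled-agent-sketch")  # hypothetical logger name


def metadata_path(root, pool_uuid, dataset_uuid):
    """Path where the metadata for a dataset is expected (layout per this issue)."""
    return Path(root) / pool_uuid / dataset_uuid


def partition_datasets(root, datasets):
    """Split (pool_uuid, dataset_uuid) pairs into (usable, orphaned).

    A dataset is "orphaned" when nothing exists at its metadata path. Orphaned
    datasets are logged loudly and skipped rather than used to start a zone.
    """
    usable, orphaned = [], []
    for pool_uuid, dataset_uuid in datasets:
        if metadata_path(root, pool_uuid, dataset_uuid).exists():
            usable.append((pool_uuid, dataset_uuid))
        else:
            log.error(
                "dataset %s in pool %s has no metadata under %s; ignoring it "
                "(this may require support intervention)",
                dataset_uuid, pool_uuid, root,
            )
            orphaned.append((pool_uuid, dataset_uuid))
    return usable, orphaned
```

Calling `partition_datasets("/var/oxide", discovered)` would then give the caller a list of datasets that are safe to launch zones for, plus a separate list to surface to an operator.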
Tangentially: if we know all the metadata we need at the time we create the ZFS dataset, we could store it in ZFS user properties and set those properties when the dataset is created. That way, there would never be a time when the dataset exists and the metadata doesn't. (This is a primary use case for ZFS user properties.) We might still have the problem that the dataset is essentially invalid; see #1884. If we went down the path of replacing the /var/oxide metadata with ZFS user properties, we'd need some other way to identify when a dataset is from a previous install. Maybe tag each one with a uuid that's created when RSS determines the initial plan, then only consider datasets with the expected uuid? (Feel free to ignore all this if it doesn't seem like it'll simplify things.)
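To make the idea concrete, here is a rough sketch of both halves: building a `zfs create` invocation that sets user properties atomically with dataset creation (via `-o property=value`), and filtering discovered datasets by the expected plan uuid. The property name `oxide:rss_plan_uuid`, the helper names, and the in-memory dataset representation are assumptions made for illustration; only `oxide:uuid` appears in this issue.

```python
# Hypothetical property name for the per-install tag proposed above.
PLAN_PROP = "oxide:rss_plan_uuid"


def zfs_create_argv(dataset_name, user_props):
    """Build argv for `zfs create` with user properties set at creation time.

    Passing properties with `-o` means the dataset never exists without them.
    ZFS user property names must contain a colon, so reject anything else.
    """
    argv = ["zfs", "create"]
    for name, value in sorted(user_props.items()):
        if ":" not in name:
            raise ValueError(f"not a valid user property name: {name!r}")
        argv += ["-o", f"{name}={value}"]
    argv.append(dataset_name)
    return argv


def datasets_for_plan(datasets, expected_plan_uuid):
    """Keep only datasets tagged with the expected plan uuid.

    `datasets` is a list of dicts like {"name": ..., "props": {...}}; datasets
    from a previous install (different or missing tag) are dropped.
    """
    return [
        d for d in datasets
        if d["props"].get(PLAN_PROP) == expected_plan_uuid
    ]
```

The resulting argv could be handed to something like `subprocess.run(argv, check=True)`. Sorting the properties only keeps the command line deterministic for logging and testing; ZFS does not care about their order.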