
Sled Agent confused by mismatched state between ZFS and /var/oxide #1885

Open davepacheco opened 1 year ago

davepacheco commented 1 year ago

During #1880 we ran into a case where the Sled Agent discovered ZFS datasets with the "oxide:uuid" property set and attempted to create new Clickhouse and CockroachDB zones using them. This is expected behavior when the system has restarted (and, for now, when sled agent restarts). The problem in this case is that Sled Agent also needs metadata that's stored in /var/oxide, but that was missing.

Relevant log entries:

04:54:04.634Z  INFO SledAgent/StorageManager: StorageWorker loading fs cockroachdb on zpool oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b
04:54:04.681Z  INFO SledAgent/StorageManager: Loading Dataset from /var/oxide/d462a7f7-b628-40fe-80ff-4e4189e2d62b/d6b41137-7fb5-43a1-939a-08302ddfc95e.toml
04:54:04.696Z  WARN SledAgent/StorageManager: StorageWorker Failed to load dataset: Failed to perform I/O: read config for pool oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b, dataset DatasetName { pool_name: "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b", dataset_name: "cockroachdb" } from "/var/oxide/d462a7f7-b628-40fe-80ff-4e4189e2d62b/d6b41137-7fb5-43a1-939a-08302ddfc95e.toml": No such file or directory (os error 2)

As I understand it:

  1. when RSS asks Sled Agent to create a dataset for Clickhouse or CockroachDB, we create a ZFS dataset, assign it a uuid, and set the oxide:uuid property on the dataset to the value we generated
  2. then we do more dataset/zone initialization stuff
  3. then we record additional metadata about the zone (like its IP address?) into /var/oxide/$pool_uuid/$dataset_uuid (a rough sketch of this sequence follows below)
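
Here is a minimal sketch of that sequence (simplified, assumed names, not the actual Sled Agent code) to show the window in which the dataset carries oxide:uuid but nothing has yet been written under /var/oxide:

```rust
// Hypothetical sketch of the ordering above; names and file contents are
// made up and do not match the actual Sled Agent code.
use std::fs;
use std::path::PathBuf;
use std::process::Command;

fn provision_dataset(
    pool: &str,         // e.g. "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b"
    dataset: &str,      // e.g. "cockroachdb"
    pool_uuid: &str,
    dataset_uuid: &str, // freshly generated in step 1
) -> std::io::Result<()> {
    // Step 1: create the ZFS dataset and tag it with the generated uuid.
    let status = Command::new("zfs")
        .arg("create")
        .arg("-o")
        .arg(format!("oxide:uuid={dataset_uuid}"))
        .arg(format!("{pool}/{dataset}"))
        .status()?;
    assert!(status.success());

    // Step 2: further dataset/zone initialization happens here.

    // Step 3: only now is the extra metadata recorded under /var/oxide.
    // A crash or interrupted run before this point leaves the mismatched
    // state described in this issue.
    let dir = PathBuf::from(format!("/var/oxide/{pool_uuid}"));
    fs::create_dir_all(&dir)?;
    fs::write(
        dir.join(format!("{dataset_uuid}.toml")),
        "address = \"[fd00::1]:32221\"\n", // illustrative contents only
    )?;
    Ok(())
}
```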

We identified at least two ways we can wind up with a dataset that has nothing in /var/oxide for it.

This should not come up in production, but if it ever does, presumably we want to flag this as an issue requiring Oxide support and otherwise ignore the dataset.
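
A sketch of what that handling could look like, assuming a hypothetical loader function (not the current StorageWorker code):

```rust
// Hypothetical handling sketch: if a dataset is tagged with oxide:uuid but
// the corresponding /var/oxide config file is missing, warn loudly and skip
// the dataset rather than failing dataset loading outright.
use std::path::PathBuf;

fn load_dataset_config(pool_uuid: &str, dataset_uuid: &str) -> Option<String> {
    let path = PathBuf::from(format!("/var/oxide/{pool_uuid}/{dataset_uuid}.toml"));
    if !path.exists() {
        eprintln!(
            "WARN: dataset {dataset_uuid} on pool {pool_uuid} has oxide:uuid set \
             but no config at {}; ignoring it (needs Oxide support)",
            path.display()
        );
        return None;
    }
    std::fs::read_to_string(&path).ok()
}
```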


Tangentially: if we know all the metadata that we need at the time we create the ZFS dataset, we could store that into ZFS properties and we can set those properties when the dataset is created. This way, there would never be a time that the dataset exists and the metadata doesn't. (This is a primary use case for ZFS user properties.) We might still have the problem that this dataset is essentially invalid -- see #1884. If we went down the path of replacing the /var/oxide metadata with ZFS user properties, we'd need some other way to identify when the dataset is from a previous install. Maybe tag each one with a uuid that's created when RSS determines the initial plan, then only consider datasets with the expected uuid? (Feel free to ignore all this if it doesn't seem like it'll simplify things.)
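
For illustration, a sketch of that alternative, with hypothetical property names oxide:address and oxide:plan_uuid (the actual set of metadata would need to match what /var/oxide stores today):

```rust
// Sketch of the user-property alternative: all metadata is set atomically
// with dataset creation, and only datasets tagged with the current RSS plan
// uuid are later considered. Property names here are hypothetical.
use std::process::Command;

fn create_with_metadata(dataset: &str, uuid: &str, address: &str, plan_uuid: &str) {
    let status = Command::new("zfs")
        .arg("create")
        .arg("-o").arg(format!("oxide:uuid={uuid}"))
        .arg("-o").arg(format!("oxide:address={address}"))
        .arg("-o").arg(format!("oxide:plan_uuid={plan_uuid}"))
        .arg(dataset)
        .status()
        .expect("failed to run zfs create");
    assert!(status.success());
}

fn belongs_to_current_plan(dataset: &str, expected_plan_uuid: &str) -> bool {
    // `zfs get -H -o value <prop> <dataset>` prints just the value ("-" if unset).
    let out = Command::new("zfs")
        .args(["get", "-H", "-o", "value", "oxide:plan_uuid", dataset])
        .output()
        .expect("failed to run zfs get");
    String::from_utf8_lossy(&out.stdout).trim() == expected_plan_uuid
}
```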

jclulow commented 1 year ago

If ZFS user properties are not sufficient, we could presumably also house a file in the root of the dataset itself. It would also be worth assessing how properties interact with encryption (are they encrypted, or are they always visible in plaintext even before the keys have been provided to unlock the dataset, etc.?).
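
One way to assess that interaction empirically would be a small probe like the following (an experiment sketch, not Sled Agent code), run against an encrypted test dataset whose key has not been loaded, i.e. where keystatus reports "unavailable":

```rust
// Probe whether a user property is readable in plaintext on an encrypted
// dataset before its key has been loaded. Dataset name is hypothetical.
use std::process::Command;

fn get_prop(dataset: &str, prop: &str) -> String {
    let out = Command::new("zfs")
        .args(["get", "-H", "-o", "value", prop, dataset])
        .output()
        .expect("failed to run zfs get");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    let dataset = "oxp_d462a7f7-b628-40fe-80ff-4e4189e2d62b/cockroachdb";
    println!("keystatus  = {}", get_prop(dataset, "keystatus"));
    println!("oxide:uuid = {}", get_prop(dataset, "oxide:uuid"));
}
```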