`Blueprint` currently has four different maps, all keyed by sled ID: https://github.com/oxidecomputer/omicron/blob/7cf372d7aca20d8cbfd18739a1d4cad37dda034a/nexus/types/src/deployment.rs#L144-L163

In general we would expect all those maps to have the same keys, but in practice that isn't true. A couple examples:
- `sled_state` drops decommissioned sleds, but they may still exist in the other maps (where they should only contain expunged values). We have to work around this when diffing blueprints.
- Clients that use `BlueprintBuilder` directly (rather than going through the planner) and fail to call `sled_ensure_disks` / `sled_ensure_datasets` at appropriate times will emit blueprints containing zones that reference datasets / disks one would reasonably expect to exist in their respective maps, but those maps may be out of date or empty. Empty maps happen to work today because we allow them for backwards compatibility, but doing so is a little messy, and that allowance will eventually be removed once production systems can all be expected to have populated maps. That will break any of these clients (tests and reconfigurator-cli, today).

I think (?) we should probably only have one map here, keyed by sled ID, with values that encompass all the blueprint details for a single sled (today: state + zones + disks + datasets, growing in the future to support update versioning as needed). This may require some fundamental rework of `BlueprintBuilder` and possibly the planner, as today they manage these maps mostly independently (which is problematic!).
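
To make the shape of the proposal concrete, here is a rough sketch. It is not the real omicron code: `SledId`, `SledState`, `FourMapBlueprint`, `SledConfig`, `SingleMapBlueprint`, and the `Vec<String>` payloads are all simplified, hypothetical stand-ins. It only illustrates how four independently managed maps can disagree on their key sets, and how a single map keyed by sled ID rules that out by construction.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-ins for the real types; names and fields are
// illustrative assumptions, not the actual definitions in deployment.rs.
type SledId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SledState {
    Active,
    Decommissioned,
}

/// Simplified version of today's layout: four maps, each keyed by sled ID.
#[derive(Debug, Default)]
struct FourMapBlueprint {
    sled_state: BTreeMap<SledId, SledState>,
    zones: BTreeMap<SledId, Vec<String>>,
    disks: BTreeMap<SledId, Vec<String>>,
    datasets: BTreeMap<SledId, Vec<String>>,
}

impl FourMapBlueprint {
    /// Sled IDs that appear in at least one map but not in all four.
    fn inconsistent_sleds(&self) -> BTreeSet<SledId> {
        let key_sets: [BTreeSet<SledId>; 4] = [
            self.sled_state.keys().copied().collect(),
            self.zones.keys().copied().collect(),
            self.disks.keys().copied().collect(),
            self.datasets.keys().copied().collect(),
        ];
        let all: BTreeSet<SledId> =
            key_sets.iter().flatten().copied().collect();
        all.into_iter()
            .filter(|id| !key_sets.iter().all(|keys| keys.contains(id)))
            .collect()
    }
}

/// Proposed layout: everything the blueprint knows about one sled.
#[derive(Debug)]
struct SledConfig {
    state: SledState,
    zones: Vec<String>,
    disks: Vec<String>,
    datasets: Vec<String>,
}

/// One map keyed by sled ID; a sled is either fully present or absent.
#[derive(Debug, Default)]
struct SingleMapBlueprint {
    sleds: BTreeMap<SledId, SledConfig>,
}

impl SingleMapBlueprint {
    /// Convert from the four-map layout, returning the offending sled IDs
    /// if the maps do not all share the same keys.
    fn from_four_maps(
        old: FourMapBlueprint,
    ) -> Result<Self, BTreeSet<SledId>> {
        let bad = old.inconsistent_sleds();
        if !bad.is_empty() {
            return Err(bad);
        }
        let FourMapBlueprint { sled_state, mut zones, mut disks, mut datasets } = old;
        let mut sleds = BTreeMap::new();
        for (id, state) in sled_state {
            // The key sets were just checked to be identical, so each
            // remove() finds an entry; unwrap_or_default is only a guard.
            let config = SledConfig {
                state,
                zones: zones.remove(&id).unwrap_or_default(),
                disks: disks.remove(&id).unwrap_or_default(),
                datasets: datasets.remove(&id).unwrap_or_default(),
            };
            sleds.insert(id, config);
        }
        Ok(Self { sleds })
    }
}
```

In this sketch, a sled that has been dropped from `sled_state` but still has an entry in `zones` shows up in `inconsistent_sleds()`, which is roughly the situation the diffing workaround has to handle today. With `SingleMapBlueprint` that state cannot be represented at all, so a builder or planner working against it would not need to keep separate maps in sync.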