oxidecomputer / omicron

Omicron: Oxide control plane

Add garbage collection for expunged sleds and zones #5552

Open · sunshowers opened this issue 4 months ago

sunshowers commented 4 months ago

Specifically, the blueprint planner needs to track which external resources are still in use, and to free the resources allocated to expunged sleds and zones so they can be reallocated.

This should likely be done both at construction time (against the parent blueprint) and incrementally.
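As a rough illustration of those two modes (all names here, `InUseMap`, `from_parent`, `release`, are hypothetical, not existing omicron APIs): the accounting could be seeded once from the parent blueprint, then updated as the builder expunges zones.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical sketch of an in-use tracker supporting both modes.
/// `R` stands in for a resource type (external IP, MAC, etc.).
struct InUseMap<R: Clone + Eq + Hash> {
    in_use: HashSet<R>,
}

impl<R: Clone + Eq + Hash> InUseMap<R> {
    /// Construction time: seed the map from every resource the parent
    /// blueprint still considers allocated.
    fn from_parent(resources: impl IntoIterator<Item = R>) -> Self {
        Self { in_use: resources.into_iter().collect() }
    }

    /// Incremental: release a resource when the builder expunges the
    /// zone that owned it, making it available for reallocation.
    /// Returns true if the resource was actually being tracked.
    fn release(&mut self, resource: &R) -> bool {
        self.in_use.remove(resource)
    }

    fn is_in_use(&self, resource: &R) -> bool {
        self.in_use.contains(resource)
    }
}
```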

Resources include external IPs, Nexus NIC IPs (v4 and v6), and system MAC addresses.

We've deferred this for release 8 but would like to do this for release 9.

sunshowers commented 4 months ago

Some half-finished code, which needs to be updated to take #5541 etc. into account:

```rust
use std::collections::{BTreeMap, HashMap};
use std::fmt::Debug;
use std::hash::Hash;

use anyhow::anyhow;

/// Builds a map of resources that either the planning input or the parent
/// blueprint says are in-use.
///
/// `T` stands in for the resource type (e.g. an external IP or MAC); the
/// concrete type parameters were elided in this snippet.
fn build_external_in_use_map<T>(
    parent_blueprint: &Blueprint,
    input_map: &BTreeMap<OmicronZoneUuid, T>,
    accessor: impl Fn(&OmicronZoneConfig) -> Option<T>,
) -> Result<HashMap<T, ResourceInUseDetails>, Error>
where
    T: Clone + Eq + Hash + Debug,
{
    // The basic table is, for each resource:
    //
    // ```
    // in blueprint?        in input_map?   result
    // yes, reachable       yes             in use
    // yes, non-reachable   yes             in use [1]
    // no                   yes             in use, warn [2]
    // yes, reachable       no              in use [3]
    // yes, non-reachable   no              not in use
    // no                   no              not in use [4]
    // ```
    //
    // "in blueprint?" means whether that resource is allocated to a zone
    // that's described in the blueprint (desired state). "reachable" is
    // defined by the `ShouldBeExternallyReachable` filter.
    //
    // "in input_map?" means whether that resource is allocated to a zone
    // that's described in the planning input (actual state).
    //
    // [1]: This is a zone that the executor will aim to remove external
    // resources for. However, it is *currently* in use, and we can't just
    // pave over and reallocate its resources.
    // [2]: This should never happen. The blueprint is meant to be a
    // complete description of the state. This is a bug -- warn about it
    // (and maybe consider erroring out?)
    // [3]: This is a zone that the executor will aim to add external
    // resources for. We consider it to be in use.
    // [4]: This resource will never be considered in the below code.
    //
    // There are also some error cases:
    //
    // 1. Two zones in the input map are using the same resource.
    // 2. Two zones in the blueprint are using the same resource.
    // 3. For the same zone, the blueprint and the input map disagree on
    //    what resource is allocated to them.
    //
    // Case 3 is potentially migration of a zone to a different resource --
    // this is not supported at the moment.
    //
    // So, how do we go about actually implementing this table? We'll
    // perform two loops: one over the input map and one over the parent
    // blueprint.
    let mut in_use: HashMap<T, ResourceInUseDetails> = HashMap::new();
    for (zone_id, resource) in input_map {
        let details = in_use.entry(resource.clone()).or_default();
        if details.in_input_map {
            // Error case 1: two zones in the input map are using the same
            // resource.
            return Err(Error::Planner(anyhow!(
                "in input map, zone {} is using a resource already \
                 claimed by another zone: {:?}",
                zone_id,
                resource,
            )));
        }
        details.in_input_map = true;
    }

    // TODO (half-finished): the second loop, over the parent blueprint,
    // isn't written yet.

    Ok(in_use)
}

/// Describes why a resource is currently marked in use.
///
/// The `Default` is only for construction -- the in-use map should always
/// have non-default values.
#[derive(Debug, Default)]
struct ResourceInUseDetails {
    in_input_map: bool,
    allocated_to_zone: Option<OmicronZoneUuid>,
}
```
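For reference, one way the missing second loop (over the parent blueprint) might look. This is only a sketch: `all_omicron_zones_typed` and a `BlueprintZoneFilter::ShouldBeExternallyReachable` variant are assumed from the test code below and the comment's description of the filter, and detection of case [2] is omitted.

```rust
// Hypothetical continuation of `build_external_in_use_map`. Iterating
// only over zones that should be externally reachable covers the
// "yes, reachable" rows of the table; the "yes, non-reachable / yes"
// row is already marked in use by the input-map loop above.
for (_sled_id, zone) in parent_blueprint
    .all_omicron_zones_typed(BlueprintZoneFilter::ShouldBeExternallyReachable)
{
    let Some(resource) = accessor(zone) else {
        continue;
    };
    let details = in_use.entry(resource.clone()).or_default();
    if let Some(other_zone) = details.allocated_to_zone {
        // Error case 2: two zones in the blueprint are using the same
        // resource.
        return Err(Error::Planner(anyhow!(
            "in blueprint, zone {} and zone {} are using the same \
             resource: {:?}",
            zone.id,
            other_zone,
            resource,
        )));
    }
    details.allocated_to_zone = Some(zone.id);
}
```

Error case 3 (the blueprint and the input map disagreeing about a zone's resource) would need a per-zone lookup against `input_map` and isn't handled in this sketch.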

And some test code:

```rust
// Create a new blueprint that starts from bp2. Ensure that the new
// builder's idea of which resources are used has been updated to
// reflect that the corresponding zones are now expunged.
let planner3 = Planner::new_based_on(
    logctx.log.clone(),
    &blueprint2,
    &input,
    "test_blueprint3",
    &collection,
)
.expect("planner created")
.with_rng_seed((TEST_NAME, "bp3"));

// Ensure that the expunged and decommissioned sled's allocated
// resources are no longer in use.
let nexus_v4_in_use = planner3.blueprint.nexus_v4_ips.in_use();
let nexus_v6_in_use = planner3.blueprint.nexus_v6_ips.in_use();
let external_ips_in_use = planner3.blueprint.available_external_ips.in_use();
let system_macs_in_use = planner3.blueprint.available_system_macs.in_use();

for (sled_id, zone) in
    blueprint2.all_omicron_zones_typed(BlueprintZoneFilter::All)
{
    // TODO: also verify quiesced zones here.
    let desc = if sled_id == expunged_sled_id {
        "expunged sled"
    } else if sled_id == decommissioned_sled_id {
        "decommissioned sled"
    } else {
        continue;
    };

    // For the decommissioned sled, ensure that IPs are no longer in
    // use.
    if let OmicronZoneType::Nexus { nic, .. } = &zone.zone_type {
        match nic.ip {
            IpAddr::V4(ip) => {
                assert!(
                    !nexus_v4_in_use.contains(&ip),
                    "for {desc}, Nexus V4 IP {ip} should not be in use"
                );
            }
            IpAddr::V6(ip) => {
                assert!(
                    !nexus_v6_in_use.contains(&ip),
                    "for {desc}, Nexus V6 IP {ip} should not be in use"
                );
            }
        }
    }

    if let Some(external_ip) = zone.zone_type.external_ip() {
        assert!(
            !external_ips_in_use.contains(&external_ip),
            "for {desc}, external IP {external_ip} should not be in use"
        );
    }

    if let Some(nic) = zone.zone_type.opte_vnic() {
        assert!(
            !system_macs_in_use.contains(&nic.mac),
            "for {desc}, system MAC {} should not be in use",
            nic.mac
        );
    }
}

logctx.cleanup_successful();
```
davepacheco commented 1 week ago

I think part of why we deferred this was that we weren't 100% sure what cleanup needed to happen before we could forget about an expunged zone. Some stuff that's come up: