oxidecomputer / omicron

Omicron: Oxide control plane

[reconfigurator] Planner can give out the same external IP to two different new zones #7049

Closed by jgallagher 1 week ago

jgallagher commented 1 week ago

The planner builds an ExternalIpAllocator by initially providing the set of all service IP pool ranges, and then later marking any IPs used by current services as in use. As new zones are added, it also marks any IPs it hands out as in use.

Independently, the planner builds up a list of available external DNS IP addresses by checking for all external DNS IPs that aren't currently in use. This is a necessary but awkward workaround for our current state, where the set of external DNS IPs is specified at RSS time and implicitly carried forward via blueprints, but not explicitly listed in CRDB or the policy.

If we are in a state where an external DNS zone has been expunged, both of these independent IP allocators will believe that zone's IP is free.

If in one planning iteration we attempt to add an external DNS zone and another zone that needs an external IP (e.g., Nexus), both new zones could be given the same IP, one from each independent allocator.
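To make the failure mode concrete, here is a minimal toy model of it. This is not the planner's code: `ToyServiceIpAllocator`, `free_external_dns_ips`, and `demo_collision` are hypothetical names, the addresses come from the 192.0.2.0/24 documentation range, and the "first free address" policy is an assumption chosen so the overlap shows up deterministically.

```rust
use std::collections::BTreeSet;
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical stand-in for the general-purpose allocator: it starts from
/// the whole service pool and tracks which addresses are in use.
struct ToyServiceIpAllocator {
    pool: Vec<IpAddr>,
    in_use: BTreeSet<IpAddr>,
}

impl ToyServiceIpAllocator {
    fn new(pool: Vec<IpAddr>, used_by_current_zones: &BTreeSet<IpAddr>) -> Self {
        Self { pool, in_use: used_by_current_zones.clone() }
    }

    /// Hand out the first free address and mark it as in use.
    fn allocate(&mut self) -> Option<IpAddr> {
        let ip = self.pool.iter().copied().find(|ip| !self.in_use.contains(ip))?;
        self.in_use.insert(ip);
        Some(ip)
    }
}

/// Hypothetical stand-in for the second, independent bookkeeping: every
/// external DNS address that no current zone is using.
fn free_external_dns_ips(dns_ips: &[IpAddr], used: &BTreeSet<IpAddr>) -> Vec<IpAddr> {
    dns_ips.iter().copied().filter(|ip| !used.contains(ip)).collect()
}

/// One planning pass that adds both a Nexus zone and an external DNS zone
/// after the external DNS zone on 192.0.2.1 has been expunged.
fn demo_collision() -> (IpAddr, IpAddr) {
    let ip = |last: u8| IpAddr::V4(Ipv4Addr::new(192, 0, 2, last));
    let service_pool = vec![ip(1), ip(2), ip(3)];
    let external_dns_ips = vec![ip(1)]; // a subset of the service pool
    // Only an existing zone on 192.0.2.2 remains; the DNS zone is gone.
    let used_by_current_zones = BTreeSet::from([ip(2)]);

    let mut allocator = ToyServiceIpAllocator::new(service_pool, &used_by_current_zones);
    let free_dns = free_external_dns_ips(&external_dns_ips, &used_by_current_zones);

    let nexus_ip = allocator.allocate().expect("pool has a free address");
    let dns_ip = free_dns[0];
    (nexus_ip, dns_ip)
}
```

In this model `demo_collision()` returns the same address twice, because neither piece of bookkeeping knows what the other has handed out.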

The easy (and correct!) fix is to always remove the set of external DNS IPs from ExternalIpAllocator's possible IPs. At RSS time we require the external DNS IPs to be a subset of the service IP pools, but they should be reserved for external DNS exclusively, and not available for other services.
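A rough sketch of that rule, under stated assumptions: `allocatable_service_ips` and `demo_reserved` are hypothetical helpers, not functions in omicron, and the real ExternalIpAllocator works on IP ranges rather than a flat set. The point is only that the external DNS addresses are subtracted unconditionally, whether or not a zone is currently using them.

```rust
use std::collections::BTreeSet;
use std::net::{IpAddr, Ipv4Addr};

/// Hypothetical helper: the addresses a general-purpose allocator may hand
/// out. External DNS addresses are always excluded, even when no zone is
/// currently using them.
fn allocatable_service_ips(
    service_pool: &BTreeSet<IpAddr>,
    external_dns_ips: &BTreeSet<IpAddr>,
    in_use: &BTreeSet<IpAddr>,
) -> BTreeSet<IpAddr> {
    service_pool
        .iter()
        .copied()
        .filter(|ip| !external_dns_ips.contains(ip) && !in_use.contains(ip))
        .collect()
}

/// Example state: 192.0.2.1 is an external DNS address whose zone was
/// expunged, and 192.0.2.2 is used by a running zone.
fn demo_reserved() -> BTreeSet<IpAddr> {
    let ip = |last: u8| IpAddr::V4(Ipv4Addr::new(192, 0, 2, last));
    let service_pool = BTreeSet::from([ip(1), ip(2), ip(3)]);
    let external_dns_ips = BTreeSet::from([ip(1)]);
    let in_use = BTreeSet::from([ip(2)]);
    allocatable_service_ips(&service_pool, &external_dns_ips, &in_use)
}
```

With the example state in `demo_reserved()`, only 192.0.2.3 remains available to non-DNS services, so a new Nexus zone can no longer land on the address reserved for external DNS.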

Thanks @askfongjojo for running face first into this while testing sled replacement!