oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0

Want mechanisms for specifying external addresses that are allowed to talk to rack services #5640

Closed bnaecker closed 6 months ago

bnaecker commented 6 months ago

Most of the external rack networking is currently geared towards customers that have private IP ranges. For example, consider Nexus or external DNS. These services listen on an IP external to the rack, as opposed to the underlay, relying on OPTE and the switch to manage tunneling that traffic through the rack. We've sort of been assuming that the external IP customers give us for these services is in some private subnet, and that the customer themselves controls any program that can reach our services from that subnet.

However, we now have customers who have no on-premises infrastructure of their own to speak of, relying entirely on cloud vendors. They have no private IP ranges that they can delegate to us. Instead, they have a completely public IP space on the open Internet, within which we'll need to run Oxide services like Nexus.

To make things slightly more secure, customers will want to limit the degree to which any old program on the public Internet can talk to these rack services. Specifically, we need a way of specifying an allowlist of the source IPs that can make requests to Nexus or external DNS. That way, customers can set up a firewall-like policy for limiting traffic to only sources they control.

This work is pretty big, so this issue will be updated and extended as we get into implementation. But here are some initial pieces we'll need:

bnaecker commented 6 months ago

Update time.

I've got the API mostly in place thanks to @ahl's help with schemars and friends. At the end of the day, we'll have a CLI interface like:

$ oxide system networking allowed-source-ips view 
{ "allow" : "any" }
$ oxide system networking allowed-source-ips update --allow list --ips ["1.2.3.4", "5.6.7.8/9"]
$ oxide system networking allowed-source-ips view 
{ "allow" : "list", "ips" : ["1.2.3.4", "5.6.7.8/9"] }
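For illustration, here is a minimal sketch of how the two shapes above ("any" versus an explicit list) could be modeled and checked against a source address. The names `Ipv4Net`, `AllowedSourceIps`, and `permits` are hypothetical and are not taken from the Omicron codebase; the sketch also assumes IPv4 only and treats a bare address as a /32.

```rust
use std::net::Ipv4Addr;

/// Hypothetical allowlist entry: an IPv4 network in CIDR form.
/// A bare address such as "1.2.3.4" is treated as a /32.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Ipv4Net {
    addr: Ipv4Addr,
    prefix: u8,
}

impl Ipv4Net {
    /// Parse "a.b.c.d" or "a.b.c.d/len"; returns None on malformed input.
    pub fn parse(s: &str) -> Option<Self> {
        let (addr, prefix) = match s.split_once('/') {
            Some((a, p)) => (a.parse::<Ipv4Addr>().ok()?, p.parse::<u8>().ok()?),
            None => (s.parse::<Ipv4Addr>().ok()?, 32),
        };
        (prefix <= 32).then_some(Self { addr, prefix })
    }

    /// True if `ip` falls inside this network.
    pub fn contains(&self, ip: Ipv4Addr) -> bool {
        // A /0 matches everything; handled separately to avoid shifting by 32.
        let mask: u32 = if self.prefix == 0 {
            0
        } else {
            u32::MAX << (32 - self.prefix)
        };
        (u32::from(self.addr) & mask) == (u32::from(ip) & mask)
    }
}

/// Hypothetical policy type mirroring the "any" / "list" shapes shown above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AllowedSourceIps {
    Any,
    List(Vec<Ipv4Net>),
}

impl AllowedSourceIps {
    /// True if a request from `src` should be let through.
    pub fn permits(&self, src: Ipv4Addr) -> bool {
        match self {
            Self::Any => true,
            Self::List(nets) => nets.iter().any(|n| n.contains(src)),
        }
    }
}
```

With the list from the example, `5.6.7.8/9` covers 5.0.0.0 through 5.127.255.255, so a source like 5.100.0.1 would be permitted while 5.200.0.1 would not.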

I spent some time with @andrewjstone and @jgallagher talking bootstore, and I think we can avoid this ever showing up there at all. They had the great point that Nexus is really two servers: an internal one and an external one. This list only applies to the external server, so we need it in place before that server starts. However, we have just such a convenient point here:

https://github.com/oxidecomputer/omicron/blob/e810f2e0590d06a4593fca8d191a201442945b9c/nexus/src/lib.rs#L423-L424

At the point we start the first server, we have all the information we need to form the OPTE firewall rules that implement this policy. That will come from RSS -> sled-agent bootstrap server -> internal Nexus server -> CRDB. At that point, before we start the external Dropshot server itself, Nexus can ask the sled-agent to insert our own OPTE firewall rules that implement this policy on the relevant ports.

We may actually want to do this here:

https://github.com/oxidecomputer/omicron/blob/e810f2e0590d06a4593fca8d191a201442945b9c/nexus/src/lib.rs#L136-L144

specifically right after the call to await_rack_initialization(). There, we're sure to have all the data in CRDB (well, I can make sure we do...), so Nexus can form the required rules and send them to their "managing" sled-agents for plumbing.
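As a rough sketch of the "form the required rules" step, the policy could be expanded into one inbound rule per (port, source) pair before the external server starts. Everything here is hypothetical: `Allow`, `InboundRule`, and `inbound_rules` are illustrative names, not Omicron or OPTE types, and the sketch assumes that traffic matching no rule is dropped by default.

```rust
/// Hypothetical stand-in for the allowlist policy; not Omicron's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Allow {
    Any,
    /// Addresses or CIDR blocks, kept as text for brevity.
    List(Vec<String>),
}

/// Hypothetical inbound rule: permit `source` to reach `port`.
/// `source == None` means "any source".
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InboundRule {
    pub source: Option<String>,
    pub port: u16,
}

/// Expand the policy into one rule per (port, source) pair.
/// Assumes a default-deny firewall, so an empty list yields no rules
/// and therefore blocks all inbound traffic on those ports.
pub fn inbound_rules(allow: &Allow, ports: &[u16]) -> Vec<InboundRule> {
    ports
        .iter()
        .flat_map(|&port| match allow {
            Allow::Any => vec![InboundRule { source: None, port }],
            Allow::List(sources) => sources
                .iter()
                .map(|s| InboundRule { source: Some(s.clone()), port })
                .collect::<Vec<_>>(),
        })
        .collect()
}
```

The set of ports is left to the caller; which ports count as "relevant" for Nexus and external DNS is not specified here and would be an assumption in any concrete call.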

bnaecker commented 6 months ago

Closed by #5686