firezone / firezone

WireGuard®-based zero-trust access platform with OIDC auth, identity sync, and NAT traversal.
https://www.firezone.dev
Apache License 2.0
6.44k stars 271 forks source link

refactor(connlib): tidy-up tech-debt around mangling of forwarded DNS queries #5391

Open thomaseizinger opened 2 weeks ago

thomaseizinger commented 2 weeks ago

With #5049, we are more explicitly modelling connlib's DNS handling using a StubResolver. The StubResolver looks at every incoming packet and, in case it is a DNS query for one of our resources, send a DNS response.

In case it isn't for one of our DNS resources, it doesn't touch the packet but instead emits a ResolveStrategy::ForwardQuery. Forwarding a query means we need to contact another DNS server to resolve it.

  1. In case the desired DNS server is a CIDR resource, we need to tunnel the query and thus treat it like any other IP packet (checking if we are connected to the corresponding gateway and if not, drop the packet and send a connection intent to the portal).
  2. In case it isn't a CIDR resource, we use the hickory-dns library to contact the original DNS server (i.e. the one that is configured either on the system or in the portal).

For historical reasons, connlib models this forwarding a bit awkwardly:

In the first case, the query packet needs to be mangled to point to the actual CIDR resource. However, this isn't done by StubResolver but instead happens only once we've already selected the gateway: https://github.com/firezone/firezone/blob/cba64355a42b7a2ae74313d30c9fc4741b33dbd0/rust/connlib/tunnel/src/client.rs#L426-L431

In the second case, we don't use the original IpPacket for much other than selecting, which upstream DNS server to contact: https://github.com/firezone/firezone/blob/cba64355a42b7a2ae74313d30c9fc4741b33dbd0/rust/connlib/tunnel/src/io.rs#L130-L135

This is the same logic as mangling the packet for case 1! If StubResolver would mangle the IpPacket right away, Io would not need to know about the mapping of proxy IPs -> upstream servers, improving the encapsulation of ClientState.

Additionally, this would allow moving the state tracking of mangled_dns_queries into StubResolver, further simplifying ClientState. If StubResolver does the mangling, it should also own the state created as part of it. As a consequence, StubResolver will also need to know about responses coming back either via the tunnel (from DNS servers that are CIDR resources) or from hickory (for DNS servers that are not). This makes sense because a stub resolver sends and receives its own queries (to upstream DNS servers). To include this in connlib's model, StubResolver could adopt a similar API as snownet::Node and ClientOnGateway where a packets are inbound and outbound packets are transformed. Those are currently called encapsulate and decapsulate although that naming isn't quite as good for StubResolver.

As a final, very intertwined piece, dns_mapping should be moved into StubResolver. Conceptually, these are the DNS servers that the StubResolver will contact in case it cannot answer the query itself (because the query isn't for one of our DNS resources). This state should be owned by StubResolver and not ClientState.

jamilbk commented 1 week ago

Ah, yes -- also refs #5491