tlswg / draft-ietf-tls-esni

TLS Encrypted Client Hello
https://tlswg.github.io/draft-ietf-tls-esni/#go.draft-ietf-tls-esni.html

Associate the ESNI record with the client-facing server IP #139

Closed jb-wisemo closed 4 years ago

jb-wisemo commented 5 years ago

To maximize the cover range for clients and simplify management of the ESNI records, it is more useful to associate the ESNI records with the server rather than the query domain name. This has the additional benefit of allowing different client-facing servers to use different rotating private DH keys.

This eliminates the issues with CNAMEs, individual query domain holders having to make changes, DNS namespace issues etc.

For example a large hoster wishing to provide cover for some clients by having their primary / CDN front ends offer ESNI for all requests can then simply enable the code on their server then add the ESNI records. They can even do this gradually over a large pool of servers, thus having some servers with different (or no) ESNI support during each careful rollout of ESNI software upgrades.

In practice the following conventions would be used:

  1. Any ESNI server IP must have exactly one PTR record (typically pointing to the hoster domain, such as a typical systematic CDN node DNS name). Let's call this name ptrnam.dom.

  2. The ESNI records would be under _esni.ptrnam.dom

  3. As with most PTR records, ptrnam.dom is expected to have exactly one AAAA or A record pointing back at the same IP. If the starting IP is IPv6, only the AAAA record needs to satisfy this; if IPv4, only the A record. The same ptrnam.dom can be used for one IPv4 and one IPv6 address if the ESNI records would be the same.

  4. For address-less server types such as Tor hidden services, the ESNI records would be published via whatever DNS or DNS-like mechanism is used for that transport, or queried by sending a DNS query for the record to the target, allowing the target to respond within that transport.
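As a rough illustration of conventions 1 to 3, a client could derive the ESNI owner name from the server IP along the following lines. This is only a sketch: the `resolve` callable, the `esni_owner_name` function and the example zone data are hypothetical stand-ins for a real DNS lookup, and nothing here is taken from the draft.

```python
import ipaddress
from typing import Callable, Optional, Sequence

# Hypothetical resolver signature: (owner name, record type) -> record strings.
Resolver = Callable[[str, str], Sequence[str]]


def esni_owner_name(server_ip: str, resolve: Resolver) -> Optional[str]:
    """Return the _esni name for server_ip under conventions 1-3, else None."""
    addr = ipaddress.ip_address(server_ip)

    # Convention 1: the server IP must have exactly one PTR record.
    ptr_names = resolve(addr.reverse_pointer, "PTR")
    if len(ptr_names) != 1:
        return None
    ptr_name = ptr_names[0].rstrip(".")

    # Convention 3: exactly one A (IPv4) or AAAA (IPv6) record points back.
    rrtype = "A" if addr.version == 4 else "AAAA"
    forward = resolve(ptr_name, rrtype)
    if len(forward) != 1 or ipaddress.ip_address(forward[0]) != addr:
        return None

    # Convention 2: the ESNI records live under _esni.<PTR name>.
    return "_esni." + ptr_name


# Made-up zone data standing in for real DNS answers.
ZONE = {
    ("10.2.0.192.in-addr.arpa", "PTR"): ["node7.cdn.example.net."],
    ("node7.cdn.example.net", "A"): ["192.0.2.10"],
}


def fake_resolve(name: str, rrtype: str) -> Sequence[str]:
    return ZONE.get((name, rrtype), [])
```

With the made-up zone above, `esni_owner_name("192.0.2.10", fake_resolve)` yields the name under which the ESNI records would be queried, while an address without a matching PTR record yields `None`.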

Benefits:

Downsides:

mcmanus commented 5 years ago

This has been kicked around before and generally not selected because of the downsides you identify, principally the serialization issue as well as practical matters of control of PTR records. It also prohibits the use of more than one key on an address. That is obviously a bad thing for the anonymity pool at its extreme, but some architectures might require more than one pool.

You can fix the latter two issues by using another suggestion that has previously been made: connect to the server, validate an IP-based certificate, and then get the ESNI key via a REST API (or equivalent). That is still likely too slow, though, and comes with the challenges of IP-based certificate management.

On Fri, Mar 1, 2019 at 1:35 PM jb-wisemo notifications@github.com wrote:

To maximize the cover range for clients and simplify management of the ESNI records, it is more useful to associate the ESNI records with the server rather than the query domain name. This has the additional benefit of allowing different client-facing servers to use different rotating private DH keys.

This eliminates the issues with CNAMEs, individual query domain holders having to make changes, DNS namespace issues etc.

For example a large hoster wishing to provide cover for some clients by having their primary / CDN front ends offer ESNI for all requests can then simply enable the code on their server then add the ESNI records. They can even do this gradually over a large pool of servers, thus having some servers with different (or no) ESNI support during each careful rollout of ESNI software upgrades.

In practice the following conventions would be used:

  1. Any ESNI server IP must have exactly one PTR record (typically pointing to the hoster domain, such as a typical systematic CDN node DNS name). Let's call this name ptrnam.dom.

  2. The ESNI records would be under _esni.ptrnam.dom

  3. As with most PTR records, ptrnam.dom is expected to have exactly one AAAA or A record pointing back at the same IP. If the starting IP is IPv6, only the AAAA record needs to satisfy this; if IPv4, only the A record. The same ptrnam.dom can be used for one IPv4 and one IPv6 address if the ESNI records would be the same.

  4. For address-less server types such as Tor hidden services, the ESNI records would be published via whatever DNS or DNS-like mechanism is used for that transport, or queried by sending a DNS query for the record to the target, allowing the target to respond within that transport.

Benefits:

B1. Large client-facing servers such as CDNs and shared IP hosters can deploy ESNI without having to negotiate DNS changes with all their customers (in fact with any of their customers).

B2. Operators of large pools of client-facing servers such as CDNs can roll out ESNI support and ESNI implementation upgrades in a piecemeal and controlled manner, similar to how other system updates are rolled out, simply because each IP address will have its own ESNI records.

B3. ESNI key rotation becomes much simpler as there is no need to synchronize it between different servers in a pool (except where a pure layer 1/2/3 load balancer shares a single IP among multiple physical/virtual servers, such as the layer 2 HALinux scheme).

B4. ESNI private key protection can be much stronger when the key is not shared among all the servers. The private key may be kept in protected hardware and/or in volatile RAM. The latter case would require clients to fall back to a different A or AAAA record until the DNS entries for the now-forgotten keys have expired, but only in case of a server crash or similar event.

B5. Client queries for ESNI records do not reveal the actual query domain that will be encrypted, only the IP address to which the ESNI extension will be sent.

B6. The whole discussion about _esni prefix or no _esni prefix becomes a lot simpler when there is no need to put ESNI records in every end query domain, only in the domains of the client-facing servers. This keeps the unprefixed TXT name space clean for existing uses such as SPF.

Downsides:

P1: Clients will have to postpone the _esni DNS query until after receiving the answer to the A or AAAA query, thus adding an extra DNS roundtrip before the first connection to a name. The usual DNS caching hierarchy should mitigate this. In particular for popular CDNs, a client will typically use the same handful of IPs for lots of unrelated traffic, thus getting the ESNI records in its local cache.

P2: VM hosting/physical hosting/connectivity providers not granting PTR name assignment control to IP address customers will cause the same problems as for other protocols (such as mail) and should thus be under the same or greater pressure to add that service. However, since the greatest ESNI benefit is with IPs shared among many unrelated domains, and the operators of such IPs typically have full IP space ownership or the leverage to get PTR records where normal customers cannot, the effect on ESNI deployment should be limited.

P3: This may or may not make it easier to do spoofing attacks, as an MitM attacker can either spoof DNS responses for the A/AAAA records of a target query domain (pointing to servers entirely under the attacker's control, including legitimate ESNI records for that server) or spoof the ESNI records of a server address for which the attacker has obtained traffic intercept ability. However, these can be secured with the same DNS countermeasures already applicable to those DNS domains.

P4: Query domains wishing to remain available in massively blocking markets (the kind that would not hesitate to block an entire CDN to stop access to a single web page) may have more difficulty opting out of ESNI so that such blocking filters can recognize them as a permitted service.


jb-wisemo commented 5 years ago

The PTR record problem is a real issue, but less so for the servers best placed to be anonymity pools.

There are other mechanisms, of course. The key part is making ESNI a server property rather than an origin property, bypassing most of the issues with configuring each of the non-controversial origin domains that are just used as cover traffic. The hard part is finding a hello content that doesn't stick out by using any values specific to using a controversial domain. Per-origin ESNI would stick out, as most origins would feel pressured to not actively join the pool, while having to do something difficult or even pricey to stay out of the pool would be less easy to encourage via generic mass pressure.

For the following approaches, compare to a normal first TLS connection:
Round 1: DNS query for A and DNS query for AAAA (happy eyeballs)
Round 2: TCP SYN
Round 3: TLS Handshake round 1

With the PTR based approach the parallelizing would go like this for first contact (later contacts benefit from DNS caching):

Round 1: DNS query for A and DNS query for AAAA (happy eyeballs)
Round 2: TCP SYN, DNS query for PTR and DNS query for ESNI keys directly in i*.arpa
Round 3: DNS query for ESNI keys at PTR name (perhaps cached at nearby resolver). This round is done only if A: no ESNI record (not even an empty one) was returned in round 2, AND B: a PTR name was returned and contains the substring "esni." For example, x1234.cdn.esni.example.net or x1234esni.example.net would cause queries of _esni.x1234.cdn.esni.example.net or _esni.x1234esni.example.net
Round 4: TLS Handshake round 1

Rule 3 B would require ESNI client-facing servers without the ability to add arbitrary reverse records to at least be able to choose a PTR name of the magic form. Note that the magic form contains no underscore, but will only cost an extra DNS lookup returning NXDOMAIN if a non-ESNI server has that name form by chance. The rule may require further refinement to deal with common provider practices.
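The round 3 condition of the PTR-based approach can be sketched as a small pure function. The function name `round3_query` and the lowercasing of the PTR name are assumptions made for illustration; the only parts taken from the description are conditions A and B and the "esni." substring.

```python
from typing import Optional

MAGIC = "esni."  # substring that marks a PTR name of the magic form


def round3_query(ptr_name: Optional[str], round2_returned_esni: bool) -> Optional[str]:
    """Return the name to query in round 3, or None if round 3 is skipped."""
    # Condition A: round 2 returned no ESNI record, not even an empty one.
    if round2_returned_esni:
        return None
    # Condition B: a PTR name was returned and contains the magic substring.
    if ptr_name is None:
        return None
    name = ptr_name.rstrip(".").lower()  # assumption: compare case-insensitively
    if MAGIC not in name:
        return None
    return "_esni." + name
```

For the two example names above this gives `_esni.x1234.cdn.esni.example.net` and `_esni.x1234esni.example.net`; a PTR name such as `www.example.net` skips round 3 entirely.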

One technique requiring fewer round trips than a full HTTP REST API would be to use a clever sequence of TLS extensions, signalling suites etc. to do everything in the TLS layer, at the cost of extra TCP round trips compared to the current draft.

A hybrid approach could go like this (first contact):

Round 1: DNS query for A and DNS query for AAAA (happy eyeballs)
Round 2: TCP SYN, DNS query for _esni.$reverseip.arpa

If _esni returned with "no" record: Round 3: TLS Hello with the signalling suite and normal SNI, server responds normally

If _esni returned with key: Round 3: TLS hello with ESNI as per draft-02

If _esni returned with "ask" record:
Round 3: TLS Hello with invalid right-size data in ESNI extension, server sends keys in encrypted ESNI response.
Round 4: Send new hello with real ESNI extension inside encrypted stream, server forwards to origin server; once encryption is activated by the origin server, client-side server drops the outer encryption and forwards remaining packets unchanged.

If no _esni record returned from DNS, server is actually ESNI:
Round 3: TLS Hello with a signalling suite and no SNI or ESNI, server sends keys in encrypted ESNI response. In parallel a new TCP SYN.
Round 4: Send new hello with real ESNI extension inside encrypted stream, server forwards to origin server; once encryption is activated by the origin server, client-side server drops the outer encryption and forwards remaining packets unchanged. (Second TCP connection may be used or closed.)

If no _esni record returned from DNS, server is not ESNI and has a default certificate matching desired SNI:
Round 3: TLS Hello with a signalling suite and no SNI or ESNI, server responds normally using its default certificate, client uses the connection. In parallel a new TCP SYN. (Second TCP connection may be used or closed.)

If no _esni record returned from DNS, server is not ESNI and has a default certificate not matching desired SNI:
Round 3: TLS Hello with a signalling suite and no SNI or ESNI, server responds normally using its default certificate, client aborts. In parallel a new TCP SYN.
Round 4: TLS Hello with the signalling suite and normal SNI, server responds normally. In parallel the first session is closed.

If no _esni record returned from DNS, server is not ESNI and has no default certificate:
Round 3: TLS Hello with a signalling suite and no SNI or ESNI, server aborts. In parallel a new TCP SYN.
Round 4: TLS Hello with the signalling suite and normal SNI, server responds normally. In parallel the first session is closed.

(Inclusion of the signalling suite after concluding it's a non-ESNI server is to detect downgrade attacks.)

The intent is that the 4-round cases will be less common than the 3-round cases. All subsequent contacts to the same server IP (up to a cache period) will simply be:
Round 1: TCP SYN
Round 2: TLS Hello according to cached type and keys (with or without session resumption), server responds as expected.
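To make the branching of the hybrid approach easier to follow, here is a sketch of how a client might pick the shape of its round 3 hello from the round 2 DNS outcome. All names (`EsniAnswer`, `FirstHello`, `hybrid_query_name`, `round3_hello`) are hypothetical, and where the description does not say whether the signalling suite is sent, the value is marked as an assumption in a comment.

```python
import ipaddress
from dataclasses import dataclass
from enum import Enum


class EsniAnswer(Enum):
    """Outcome of the round 2 _esni lookup (names are illustrative)."""
    NO = "no"          # explicit "no" record
    KEY = "key"        # record carrying ESNI keys
    ASK = "ask"        # "ask" record: obtain the keys from the server itself
    ABSENT = "absent"  # no _esni record came back


@dataclass(frozen=True)
class FirstHello:
    sni: bool               # send the plain SNI extension
    esni: str               # "none", "real" or "placeholder"
    signalling_suite: bool  # include the signalling cipher suite


def hybrid_query_name(server_ip: str) -> str:
    """Build _esni.$reverseip.arpa for the address being connected to."""
    return "_esni." + ipaddress.ip_address(server_ip).reverse_pointer


def round3_hello(answer: EsniAnswer) -> FirstHello:
    """Pick the shape of the round 3 hello from the DNS outcome."""
    if answer is EsniAnswer.NO:
        return FirstHello(sni=True, esni="none", signalling_suite=True)
    if answer is EsniAnswer.KEY:
        # Assumption: the description does not say whether the suite is sent.
        return FirstHello(sni=False, esni="real", signalling_suite=False)
    if answer is EsniAnswer.ASK:
        # Right-sized but invalid ESNI data; the real ESNI follows in round 4.
        # Assumption: the signalling suite is not mentioned for this case.
        return FirstHello(sni=False, esni="placeholder", signalling_suite=False)
    # No record: probe with neither SNI nor ESNI. What happens next depends on
    # how the server answers (keys, default certificate, or abort).
    return FirstHello(sni=False, esni="none", signalling_suite=True)
```

The four "no _esni record" cases all start from the same hello, so the sketch deliberately stops at round 3; the round 4 behaviour depends on the server's response rather than on DNS.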

Neither approach requires special IP address certificates.

kazuho commented 5 years ago

A hybrid approach could go like this (first contact):

Round 1: DNS query for A and DNS query for AAAA (happy eyeballs)
Round 2: TCP SYN, DNS query for _esni.$reverseip.arpa

If _esni returned with "no" record: Round 3: TLS Hello with the signalling suite and normal SNI, server responds normally

If _esni returned with key: Round 3: TLS hello with ESNI as per draft-02

I agree that this method does not introduce an additional round-trip for TCP. But it will for QUIC, and I do not think we'd want to have that overhead always for QUIC.

jb-wisemo commented 5 years ago

Since QUIC is a new undeployed protocol running on UDP instead of TCP, it can move its SNI-like operation into the encrypted part and (where applicable) use some kind of in-protocol session resumption to switch from client-facing server public keys to origin server public keys. QUIC can also design its own mechanism for providing an encrypted non-repeating origin server indication in the early client to server packets.

ekr commented 5 years ago

Given that QUIC is already using TLS 1.3, I'd prefer that it be able to share mechanisms.

chris-wood commented 4 years ago

I think this is orthogonal to the contents of the draft. Namely, servers could maintain this sort of IP<->key association on their own without any client changes. Thus, closing as is. Please re-open if you think otherwise and have a suggested change for the document!

jb-wisemo commented 4 years ago

The closing reason seems to be a complete misunderstanding. I have not checked how my old wording fits or does not fit the current draft.

Issue #139 is about handling multiple public servers (with different "client-facing" server IPs and different configuration), not about tracking client IPs.

chris-wood commented 4 years ago

The reasoning above isn’t about tracking client IPs. Please give the latest draft a review and re-open if you think it’s needed. And please provide suggested changes if they’re needed!