Server retry flow, section 7.1

sayrer commented 7 months ago

I understand why the last part of this section is there.

Otherwise, if all candidate ECHConfig values fail to decrypt the extension, the client-facing server MUST ignore the extension and proceed with the connection using ClientHelloOuter, ...

Is this part really a MUST, though? Maybe if the server wishes to allow retries...? I noticed that you can get the same effect by sending no retry_configs and refusing non-ECH connections, just with double the connection attempts. So why not allow ech_failure errors? Apologies if this part was already discussed, I might have missed it.

davidben commented 7 months ago

This is really a MUST. The entire design of both the recovery flow and GREASE is predicated on this.

The recovery flow requires the server to handshake with ClientHelloOuter in order to authenticate the retry signal. As a client, we would not be willing to deploy ECH if servers were not required to send a retry signal, because it's far too much of a footgun and the end result is that users cannot visit sites.

Even if we were to make the retry signal optional, this is necessary for GREASE to work. Consider a server transitioning from not deploying ECH to deploying ECH. Because DNS fundamentally may get out of sync with the server, there will be a period of time when the serving endpoint implements ECH, but the client does not realize it. (Cached DNS records, etc.) Such a client will then send ECH GREASE. To the server, ECH GREASE looks like a payload that cannot be decrypted with any available ECH key. For such connections to work, the server must handshake with ClientHelloOuter.
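The server-side rule described above can be sketched roughly as follows. This is only an illustration of the decision, not code from the draft or from any TLS library: `ServerDecision`, `choose_client_hello`, `try_decrypt`, and `fake_decrypt` are hypothetical names, and decryption is stubbed out. The point it shows is that GREASE and a stale ECHConfig both take the same "nothing decrypted" path, which ends in a handshake with ClientHelloOuter plus retry configs.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ServerDecision:
    client_hello: bytes       # the ClientHello the server proceeds with
    used_inner: bool          # True if ClientHelloInner was recovered
    send_retry_configs: bool  # True if ECH was offered but could not be decrypted


def choose_client_hello(
    outer: bytes,
    ech_payload: Optional[bytes],
    candidate_keys: Sequence[bytes],
    try_decrypt: Callable[[bytes, bytes], Optional[bytes]],
) -> ServerDecision:
    """Pick the ClientHello to handshake with (hypothetical helper)."""
    # No ECH extension, or a server with no ECHConfigs (ECH not enabled):
    # handshake with ClientHelloOuter and send no retry signal.
    if ech_payload is None or not candidate_keys:
        return ServerDecision(outer, False, False)

    # Try every candidate key; the first successful decryption wins.
    for key in candidate_keys:
        inner = try_decrypt(key, ech_payload)
        if inner is not None:
            return ServerDecision(inner, True, False)

    # Nothing decrypted. This is indistinguishable from GREASE, so the
    # server must not abort: it continues with ClientHelloOuter and
    # supplies retry configs.
    return ServerDecision(outer, False, True)


# Stub standing in for real HPKE decryption (assumption, for illustration only).
def fake_decrypt(key: bytes, payload: bytes) -> Optional[bytes]:
    return b"inner" if payload == b"sealed-for-" + key else None


grease = choose_client_hello(b"outer", b"random-bytes", [b"k1"], fake_decrypt)
real = choose_client_hello(b"outer", b"sealed-for-k1", [b"k1"], fake_decrypt)
```

Here `grease` falls back to the outer ClientHello with the retry signal set, while `real` proceeds with the recovered inner ClientHello.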

davidben commented 7 months ago

Note this transition may also last indefinitely. It could well be that the server deploys ECH, has published ECH in the DNS, but for whatever reason, the client does not have those records. The client could be using a non-DoH DNS resolver where the records don't get through. The record may just fail to get through because DNS is flaky. Or the client's local policy may have ECH disabled.

Even if that client cannot take advantage of ECH, it may still send ECH GREASE to help prevent network ossification. But, again, that only works if servers all correctly handshake with ClientHelloOuter.

sayrer commented 7 months ago

I believe I understand all of your points, and I agree with most of them, I think. My point is that you can still comply with the current draft, and have this flow:

1) Fail to decrypt ECH, whether it's because of an invalid or stale ECHConfig, or GREASE
2) Respond with no retry_configs
3) Refuse any ClientHello messages without ECH ("SHOULD retry the handshake with a new transport connection and ECH disabled")

One could also send a HelloRetryRequest with the wrong number of bytes in the encrypted_client_hello extension. So, there are a couple of ways to do this already. What I'm asking about here is whether there should be a way to signal "you need an up-to-date ECHConfig, and the server is not going to supply one."

davidben commented 7 months ago

What I'm asking about here is whether there should be a way to signal "you need an up-to-date ECHConfig, and the server is not going to supply one."

There should not. The current protocol is that the server should supply one. This is important for ECH to be deployable.

sayrer commented 7 months ago

What I'm asking about here is whether there should be a way to signal "you need an up-to-date ECHConfig, and the server is not going to supply one."

There should not. The current protocol is that the server should supply one. This is important for ECH to be deployable.

Ah, but that is not the current protocol by my read. See the text "If the server is configured with any ECHConfig" and 'If the server provided "retry_configs"...'. That means the server can be configured without them, or choose not to supply them, I think.

I agree that most servers will want to supply these retry_configs.

davidben commented 7 months ago

"If the server is configured with any ECHConfigs" refers to whether the server is capable of decrypting any encrypted ClientHellos at all. I.e. if it is not configured with any ECHConfigs, it does not have ECH enabled.

'If the server provided "retry_configs"' is text for the client. The client needs to accommodate servers that do not implement this protocol. In particular, it's possible to offer ECH and find the server doesn't enable ECH if, e.g., the server had to roll back support for ECH in an emergency. In order for ECH to be safe for a server operator to enable, it must be safe to roll back in case something goes wrong with the deployment.

The rule is not that most servers will want to supply retry configs. The rule is that all servers that intend to enable ECH must supply retry configs. If they do not, they have not correctly enabled ECH.

sayrer commented 7 months ago

The rule is not that most servers will want to supply retry configs. The rule is that all servers that intend to enable ECH must supply retry configs. If they do not, they have not correctly enabled ECH.

Well, I didn't read it that way at all. But you can still get this same behavior by supplying something in retry configs you know will not work.

If none of the values provided in "retry_configs" contains a supported version, or an earlier TLS version was negotiated, the client can regard ECH as securely disabled by the server

I don't think this part is right. The server could send an ECHConfig with an "unsupported mandatory extension", for example.

davidben commented 7 months ago

I suppose that depends on how you read "can regard as". If a client sees an ECHConfigList in DNS with only incompatible ECHConfigs, it will not offer ECH. Likewise, if the client sees it in retry configs, it should retry without offering ECH. This is to repair DNS/endpoint mismatches, so it should act as if it saw that in DNS.
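As a rough sketch of that client-side reading: after an ECH rejection, the client filters the retry configs the same way it would filter an ECHConfigList from DNS, and offers ECH on the retry only if something usable remains. The names here (`RetryConfig`, `plan_retry`, the `has_unsupported_mandatory_extension` flag, and the version numbers in the usage lines) are hypothetical placeholders, not taken from the draft or any implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple


@dataclass(frozen=True)
class RetryConfig:
    version: int
    # Assumption: a config carrying a mandatory extension the client does
    # not understand is treated as incompatible, like an unknown version.
    has_unsupported_mandatory_extension: bool = False


def plan_retry(
    retry_configs: Optional[Sequence[RetryConfig]],
    supported_versions: Set[int],
) -> Tuple[str, List[RetryConfig]]:
    """Decide how to retry after ECH was rejected (hypothetical helper)."""
    usable = [
        cfg
        for cfg in (retry_configs or [])
        if cfg.version in supported_versions
        and not cfg.has_unsupported_mandatory_extension
    ]
    if usable:
        # At least one compatible config: retry on a new connection with ECH.
        return ("retry_with_ech", usable)
    # Only incompatible configs (or none at all): act as if DNS had
    # advertised nothing usable and retry without offering ECH.
    return ("retry_without_ech", [])


# Placeholder version numbers, for illustration only.
supported = {1}
good = plan_retry([RetryConfig(version=1)], supported)
bad = plan_retry([RetryConfig(version=1, has_unsupported_mandatory_extension=True)], supported)
```

In this sketch `good` leads to a retry with ECH, while `bad` leads to a retry without it, which mirrors the "act as if it saw that in DNS" behavior described above.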

Anyway, I'd suggest that if some of the wording isn't clear to you, put together a PR? Hopefully the intent of the spec is clearer now.

sayrer commented 7 months ago

I can understand your point of view here. But even accepting everything you've written, you can still get the effect I described in the first post. It just costs two connections vs an error right away. I guess you could also send handshake_failure on the first attempt.

ekr commented 4 months ago

Absent more support, I plan to close this on 2/24.

tlswg / draft-ietf-tls-esni

Server retry flow, section 7.1 #586