h2o / picotls

TLS 1.3 implementation in C (master supports RFC8446 as well as draft-26, -27, -28)
536 stars 140 forks source link

external PSK mode #481

Closed kazuho closed 5 months ago

kazuho commented 1 year ago

Subsumes #321, https://github.com/sshock/picotls/pull/1.

ToDo:

sshock commented 1 year ago

As part of this work, could we add the ability to force PSK-only mode? I found the require_dhe_on_psk option, but that only presents two scenarios, not three:

  1. require_dhe_on_psk = 1:
    • client sends just psk_dhe_ke mode (and a key_share of course)
    • server chooses PSK-DHE of course and sends back a key_share
  2. require_dhe_on_psk = 0:
    • client sends both psk_dhe_ke and psk_ke mode and a key_share
    • server chooses psk_ke (PSK-only) and does not send back a key_share
  3. ??? <-- this isn't possible yet
    • client sends just psk_ke mode and no key_share
    • server chooses psk_ke (PSK-only) and does not send back a key_share

A couple reasons why this could be useful:

Thanks!

sshock commented 11 months ago

@kazuho I noticed the following behavior with external PSK authentication:

  1. When the PSK identity matches but the secret is wrong, the server fails right away with error 51 PTLS_ALERT_DECRYPT_ERROR. This is good.
  2. When the PSK identity doesn't match, the server continues the handshake, although the ServerHello lacks a selected pre_shared_key. Then the client fails with error 47 PTLS_ALERT_ILLEGAL_PARAMETER. Is this the expected behavior?

I'm concerned in the 2nd case that the failure is happening on the client. If the client tried to keep going would it fail on the server? I'm concerned because it appears from the server's perspective that the handshake completed since it did send a TLS Handshake Finished and there is a SERVER_TRAFFIC_SECRET_0.

Update Oct 13, 2023:

I noticed the server is entering "force a full handshake" mode, which makes sense to do when no PSK is selected, and it is sending an empty EE, an empty Certificate, Finished, and then a New Session Ticket.

I still think it is not good that the server reaches handshake complete. I think if it reaches a point where it is wanting to send a 0-length Certificate, it probably ought to have failed by then.

I tried several things to craft a client that would continue the handshake, including making it ignore the fact that the Certificate was empty and there was no CertificateVerify, and even modified it to not plug the PSK into the key schedule. But so far I have been unable to get past verify_finished and the SERVER_TRAFFIC_SECRET_0 doesn't match what the server comes up with.

But despite the fact that I was unable to achieve this today, given the fact that the server is definitely not plugging the PSK into the key schedule, I still believe it must be possible to craft a client that will come up with the same master secret.

Interestingly, I tested picotls against a wolfssl TLS server and it appears to have the same issue. It replied with a ServerHello with no psk selected, an empty EE, and TLS Finished. So it may have the same issue as picotls. Openssl server on the other hand behaved much better and failed immediately with error 40 handshake_failure.

I think we need to get to the bottom of this before allowing this PR to go in. Even if we thought we could prove that the handshake can't continue, we would want this to fail on the server right away. There is no valid use case for a TLS server to continue when no PSK has been selected and it has no certs. I do not believe TLS 1.3 permits making a connection with just a DHE key exchange and no authentication of any kind.

kazuho commented 10 months ago

@sshock

I noticed the server is entering "force a full handshake" mode, which makes sense to do when no PSK is selected, and it is sending an empty EE, an empty Certificate, Finished, and then a New Session Ticket.

I still think it is not good that the server reaches handshake complete. I think if it reaches a point where it is wanting to send a 0-length Certificate, it probably ought to have failed by then.

Thank you for bringing the issue to our attention, I think it depends.

RFC 8446 appendix C.5 states that applications MUST NOT use TLS in such a way absent explicit configuration or a specific application profile. To paraphrase, if a server can run anonymously depends on the intent of the user or the application, and the willingness of the TLS stack to support it.

Also, Section 6.2 explicitly states that servers may avoid sending unknown_psk_identity alert due to privacy concerns.

As a solution, how about doing something like below inside client_handle_hello?

if (tls->pre_shared_key.identity.base != NULL && !tls->is_psk_handshake) {
    ret =  PTLS_ALERT_HANDSHAKE_FAILURE;
    goto Exit;
}

Running as a PSK client, we know that the handshake cannot conclude when the server has rejected the PSK identity being offered.

sshock commented 10 months ago

@sshock

RFC 8446 appendix C.5 states that applications MUST NOT use TLS in such a way absent explicit configuration or a specific application profile. To paraphrase, if a server can run anonymously depends on the intent of the user or the application, and the willingness of the TLS stack to support it.

Interesting. I had not expected TLS to give any allowance for establishing an unauthenticated connection.

Although the scenarios described here seem to be related to situations where perhaps this might be the desired behavior, such as when using raw public keys. I don't think the use case of "I'm a server using PSK for authentication; but if the client doesn't have a matching PSK, I want to let them connect anyways with no authentication" is a desired behavior, especially not by default.

Also, Section 6.2 explicitly states that servers may avoid sending unknown_psk_identity alert due to privacy concerns.

Oh neat. I never even noticed the unknown_psk_identity error code before.

So this is what I see in section 6.2:

unknown_psk_identity: Sent by servers when PSK key establishment is desired but no acceptable PSK identity is provided by the client. Sending this alert is OPTIONAL; servers MAY instead choose to send a "decrypt_error" alert to merely indicate an invalid PSK identity.

I think that is saying a server can choose to send "decrypt_error" instead of "unknown_psk_identity", not that it can choose to keep going with the handshake.

As a solution, how about doing something like below inside client_handle_hello?

if (tls->pre_shared_key.identity.base != NULL && !tls->is_psk_handshake) {
    ret =  PTLS_ALERT_HANDSHAKE_FAILURE;
    goto Exit;
}

Yeah, I like this. Although my focus has been on the server side, I think this is a good fix on the client side to help make sure it will fail when doing PSK if the server does not select a PSK.

Running as a PSK client, we know that the handshake cannot conclude when the server has rejected the PSK identity being offered.

Sounds good. I think we want that fix.

Now what about the server side? My concern is that if we have set up a server to do PSK authentication (no certs), only a client with a matching PSK identity and secret should be able to connect. And right now the server does fail as it should when the secret is wrong, but it allows the handshake to succeed when the client provides a wrong psk identity. Instead of completing the handshake I think the server should fail with unknown_psk_identity or decrypt_error.

Similarly, the server needs to reject a connection if the client sends no pre_shared_key extension at all.

Unlike cert-based auth where the more common use case is to do just server auth and not client auth, with PSK one expects mutual auth (though I admit I couldn't find anything in the RFC mandating this). In any case, mutual auth needs to at least be possible if not the default. We need the handshake to fail on the server if the client did not provide a matching PSK.

(BTW, I tried setting the require_client_authentication flag to see if it would help anything, but it just broke everything as this flag is very specifically tied to cert-based client auth.)

kazuho commented 10 months ago

Thanks, I'm convinced. What do you think about #489?

sshock commented 10 months ago

Thanks, I'm convinced. What do you think about #489?

Cool, thanks!

Just one thing: did you mean to include the check for require_client_authentication? I don't mind having to specify that to get the desired behavior on the server, but if we do want to use that a few spots will need to be tweaked for it to not cause problems:

Maybe it would be better to have a new flag for this instead; then we could also invert the name and logic so the default behavior will be to require client authentication; e.g., call it psk_allow_unauthenticated_client.

kazuho commented 10 months ago

@sshock That's a good point. I did not take that path because I wasn't certain to have one flag change the behavior of both external PSK mode and X.509 mode.

For external PSK, the only reason for having a flag is to allow fallback to authentication using other methods. For external PSK to allow fallbacks, the unified flag will be set to false. That means that users cannot use certificate-based client authentication as a fallback.

Is that the desired behavior?

Maybe it is not. Hence I ended up with a PR that does not change how require_client_authentication works at all.

Regarding these fall backs, rather than adding a flag to allow mismatch of keys, maybe we should better extend the on_client_hello_cb callback to carry the PSK identity so that application can switch to a different ptls_context_t that does not have the external PSK set. Adding flags lead to complexity.

kazuho commented 10 months ago

Regarding these fall backs, rather than adding a flag to allow mismatch of keys, maybe we should better extend the on_client_hello_cb callback to carry the PSK identity so that application can switch to a different ptls_context_t that does not have the external PSK set. Adding flags lead to complexity.

Did this and merged #489 here.

sshock commented 10 months ago

Regarding these fall backs, rather than adding a flag to allow mismatch of keys, maybe we should better extend the on_client_hello_cb callback to carry the PSK identity so that application can switch to a different ptls_context_t that does not have the external PSK set. Adding flags lead to complexity.

Yeah, I agree that adding flags adds complexity. And enhancing the on_client_hello callback is more flexible, although it is a breaking API change, so that's unfortunate.

So it's ok to for the on_client_hello callback to change the ptls_context_t? That feels a little strange for one to be changing that in the middle of a handshake, but I guess it makes sense for this use case.

So an example use case here might be that in this callback one manually checks to see if there is at least one identity and it matches one of the psk identities on the server, and if not change the context to one that doesn't have external PSK to fallback to cert-based auth?

Did this and merged #489 here.

Thanks. Unfortunately, I see you still are still checking require_client_authentication on line 4601

    if (tls->ctx->pre_shared_key.identity.base != NULL && tls->ctx->require_client_authentication && psk_index == SIZE_MAX) {
        ret = PTLS_ALERT_UNKNOWN_PSK_IDENTITY;
        goto Exit;
    }

Did you mean to leave that in there? I think we want to take it out and change it to this now:

    if (tls->ctx->pre_shared_key.identity.base != NULL && psk_index == SIZE_MAX) {
        ret = PTLS_ALERT_UNKNOWN_PSK_IDENTITY;
        goto Exit;
    }

(You can't leave the require_client_authentication check on this line unless you want to also tweak all the other spots I mentioned. But I think you meant to take it out here, now that you've provided a different way to do fallback.)

Update: Good news. I did a quick test and was able to confirm that, after fixing line 4601, everything works as expected. An incorrect PSK from the client results in a 115 unknown_psk_identity from the server.

kazuho commented 10 months ago

@sshock Thanks, I missed that you were talking about me accidentally referring to require_client_authentication. I've addressed that in e220cda.

So it's ok to for the on_client_hello callback to change the ptls_context_t? That feels a little strange for one to be changing that in the middle of a handshake, but I guess it makes sense for this use case.

Correct. This is actually pretty common practice. This is how we switch between different certificates based on SNI, not only when using picotls but also with other TLS stacks that I know of.

sshock commented 10 months ago

@sshock Thanks, I missed that you were talking about me accidentally referring to require_client_authentication. I've addressed that in e220cda.

Perfect, thanks!

So it's ok to for the on_client_hello callback to change the ptls_context_t? That feels a little strange for one to be changing that in the middle of a handshake, but I guess it makes sense for this use case.

Correct. This is actually pretty common practice. This is how we switch between different certificates based on SNI, not only when using picotls but also with other TLS stacks that I know of.

Good to know!

sshock commented 6 months ago

Hi @kazuho. I'm seeing an unexpected behavior for no matching PSK identity in a scenario where there is no key share.

In my testing earlier with mismatching PSK identity, the server would fail with a 115 UNKNOWN_PSK_IDENTITY as expected, as we fixed it here to do.

However, in my latest testing I have no key share (ptls_context_t key_exchanges is set to NULL), and now instead of failing with 115 as expected, the server sends a HelloRetryRequest.

This is happening around line 4633 in picotls.c, right after the /* try external psk handshake */ section, where it does the HelloRetryRequest:

    /* send HelloRetryRequest if enforced by config or upon key-share mismatch, unless PSK has already been selected */
    if (!is_second_flight && psk_index == SIZE_MAX &&
        (key_share.algorithm == NULL || (properties != NULL && properties->server.enforce_retry))) {

I'm not sure what the right solution is, but it seems like maybe this block of code to fail with UNKNOWN_PSK_IDENTITY needs to be moved up (to right after the /* try external psk handshake */ section)?

    /* If the server was setup to use an external PSK but failed to agree, abort the handshake. Because external PSK is a form of
     * mutual authentication, we should abort continue the handshake upon negotiation failure, at least by default. */
    if (tls->ctx->pre_shared_key.identity.base != NULL && psk_index == SIZE_MAX) {
        ret = PTLS_ALERT_UNKNOWN_PSK_IDENTITY;
        goto Exit;
    }

I'm not confident though because that is currently placed after a /* try psk handshake */ section (is this second one for handling resumption psk?), so if we moved it up would it break resumption psk?

I'm hoping you can understand the proper way to fix this 🤞, thanks!

sshock commented 6 months ago

Since my primary concern is security, I can live with this inconsistent behavior as long as it doesn't present a security vulnerability. That is, as long as the 2nd ClientHello attempt will fail, both with or without a key share. I just don't want a client to be able to trick a server into allowing an anonymous connection.

If you can help me confirm this, that would make me feel a lot better.

Then we can take our time thinking about the proper way to fix it, which I need your help with.

Or if you believe this behavior is correct, please explain. The RFC says Hello Retry is when "it is able to find an acceptable set of parameters but the ClientHello does not contain sufficient information to proceed with the handshake." I just don't see how "no matching PSK identity" falls into this category, but only when key share is missing.

kazuho commented 5 months ago

@sshock

I'm seeing an unexpected behavior for no matching PSK identity in a scenario where there is no key share.

In my testing earlier with mismatching PSK identity, the server would fail with a 115 UNKNOWN_PSK_IDENTITY as expected, as we fixed it here to do.

However, in my latest testing I have no key share (ptls_context_t key_exchanges is set to NULL), and now instead of failing with 115 as expected, the server sends a HelloRetryRequest.

Thank you for the patience.

I've made some changes reducing differences from master so that this PR can be merged. I think I've fixed the bug alongside that. We also have tests that involve PSK + HRR.

sshock commented 5 months ago

I've made some changes reducing differences from master so that this PR can be merged. I think I've fixed the bug alongside that. We also have tests that involve PSK + HRR.

Thanks @kazuho ! I was able to confirm that it now fails properly with 115 on wrong psk identity, even with no key share.

sshock commented 5 months ago

I also noticed that cli.c now helps set the psk_hash (a522cd9), thanks!