cfrg / draft-irtf-cfrg-opaque

The OPAQUE Asymmetric PAKE Protocol
https://cfrg.github.io/draft-irtf-cfrg-opaque/draft-irtf-cfrg-opaque.html

Update OPRF to adapt to POPRF #281

Closed bytemare closed 2 years ago

bytemare commented 2 years ago

The (V)OPRF spec has been updated to include the POPRF findings. We should integrate these changes since this spec depends on it.

Update: POPRF merged with #282, but some questions remain

bytemare commented 2 years ago

I'm thinking of context binding using info = configuration || client_id || server_id, because that's all that's public and accessible to the client. But this requires defining a serialization of the configuration, which we don't have at the moment.
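To make the serialization question concrete, here is a minimal sketch of one way the three fields could be combined unambiguously. The 2-byte length prefix, the helper names `length_prefixed` and `build_info`, and the byte-string representation of the configuration are all assumptions for illustration; the draft does not define this encoding.

```python
import struct


def length_prefixed(data: bytes) -> bytes:
    """Prefix a field with its 2-byte big-endian length (hypothetical encoding)."""
    if len(data) > 0xFFFF:
        raise ValueError("field too long for a 2-byte length prefix")
    return struct.pack(">H", len(data)) + data


def build_info(configuration: bytes, client_id: bytes, server_id: bytes) -> bytes:
    """Build info = configuration || client_id || server_id with length prefixes.

    Without the prefixes, ("ab", "c") and ("a", "bc") would concatenate to the
    same bytes, so two different contexts could collide.
    """
    fields = (configuration, client_id, server_id)
    return b"".join(length_prefixed(field) for field in fields)


info = build_info(b"example-config", b"alice", b"server.example")
```

The length prefixes are the point of the sketch: they keep field boundaries recoverable, which a plain concatenation does not.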

gtank commented 2 years ago

This adds an important security property for our (@celo-org's) OPAQUE use case. We can't strongly identify or authorize the users of our OPRF endpoint because it's part of an account recovery flow, so we rely on rate limiting client identities to prevent brute force attempts. As currently specified, OPAQUE seems to allow the following attack (which @nategraf and I spotted here and also in our current design with a different OPRF construction - it's a fundamental issue):

  1. Attacker submits a fraudulent request for the credential file of a target identity. They receive the encrypted blob but fail to decrypt it. This preserves the main security guarantees, since it prevents attacker access to the stored credential or a valid session key, but it still gives them a guess at the password's OPRF evaluation. We assume rate limiting stops giving OPRF outputs in response to their password guesses at some point, but the attacker can retain the encrypted credential file of the target account.

  2. Attacker then creates a sequence of new accounts and requests their own credential files, but continues guessing the target's password and applying the OPRF outputs to decryption attempts on the target file.

Basically, since the OPRF evaluation isn't domain-separated by identity, the target's masking key can be derived from a successful guess by anyone. You could use different keys for different users (which is how some POPRFs work anyway), but a single-key POPRF would allow us to bind the client ID to the OPRF evaluation and enforce rate limits on a per-client basis without the complexity of additional keys.
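The difference between an unbound and an identity-bound evaluation can be illustrated with a toy model. This sketch uses HMAC-SHA256 as a non-oblivious stand-in for the OPRF/POPRF, so it shows only the domain-separation effect, not the real protocol; `toy_prf`, the key, and the identities are hypothetical.

```python
import hashlib
import hmac


def toy_prf(key: bytes, password: bytes, info: bytes = b"") -> bytes:
    """Non-oblivious stand-in for an OPRF (info empty) or POPRF (info set)."""
    # Length-prefix info so (info, password) pairs cannot collide.
    message = len(info).to_bytes(2, "big") + info + password
    return hmac.new(key, message, hashlib.sha256).digest()


server_key = b"\x01" * 32  # single server-wide key (toy value)
guess = b"hunter2"

# Unbound: a guess evaluated through a sybil account yields the same output
# as one evaluated for the target, so it works against the target's file.
target_out = toy_prf(server_key, guess)
sybil_out = toy_prf(server_key, guess)
unbound_match = hmac.compare_digest(target_out, sybil_out)

# Bound: the client identity is part of the evaluation, so outputs obtained
# through a sybil account are useless against the target's file.
bound_target = toy_prf(server_key, guess, info=b"target@example.com")
bound_sybil = toy_prf(server_key, guess, info=b"sybil-1@example.com")
bound_match = hmac.compare_digest(bound_target, bound_sybil)
```

In the unbound case the two outputs are identical, so a per-account rate limit does not constrain guesses against the target; in the bound case they differ.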

nategraf commented 2 years ago

As additional context, here is a spec for a POPRF API with domain separators @celo-org is currently implementing. https://github.com/celo-org/celo-proposals/blob/master/CIPs/cip-0040.md

bytemare commented 2 years ago

@gtank I don't know if you saw it, but currently (without POPRF) the OPRF key is derived from a mix of a "record_identifier" that's unique to a client_identity, and a seed. This way, every client has its OPRF evaluated with a different key. It's not proper domain separation but still ensures keys are unique.
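As a rough sketch of the per-client derivation described above: a seed for each client's OPRF key is expanded from the server-wide seed and the record identifier. The use of HKDF-Expand with SHA-256, the `"OprfKey"` label, and the 32-byte output are assumptions here, and the final step of mapping the seed to a private scalar in the OPRF group is omitted.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with SHA-256."""
    if length > 255 * hashlib.sha256().digest_size:
        raise ValueError("requested output too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_oprf_key_seed(oprf_seed: bytes, record_identifier: bytes) -> bytes:
    """Derive a per-client seed; the label and length are illustrative assumptions."""
    return hkdf_expand(oprf_seed, record_identifier + b"OprfKey", 32)


oprf_seed = b"\x02" * 32  # server-wide secret (toy value)
alice_seed = derive_oprf_key_seed(oprf_seed, b"record-alice")
bob_seed = derive_oprf_key_seed(oprf_seed, b"record-bob")
```

Each record identifier yields a different seed, and therefore a different OPRF key, while the server stores only the single `oprf_seed`.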

nategraf commented 2 years ago

@bytemare Speaking for myself, I did not notice that before. It does seem to address the primary concern, of an attacker being able to use sybil accounts to make additional online guessing attempts outside any client-specific rate limit. As a tradeoff, this seems to prevent verification of OPRF evaluation against any pre-shared (e.g. in the client binary or through PKI) server public key. It also prevents threshold implementations that rely on algebraically related keys (e.g. BLS signatures). If I understand the standard correctly, these are both trade-offs that are accepted here.

On a partially related note, I noticed that the public portion of the OPRF key is not shared with the client in this standard. If I understand correctly, this means that none of the OPRF evaluations can be verified, even if the OPRF protocol supports it. This is as opposed to sharing it with the client as part of the OPRF evaluation response, and possibly including it in the message over which the envelope auth tag is computed to enable pinning of the OPRF key after registration. Is my understanding correct, and is this intentional?

kevinlewi commented 2 years ago

@nategraf Yes that is a correct understanding and it is intentional. The original reason being that if we used a VOPRF, then the client needs to have on-hand their password as well as the VOPRF verification key to complete the key exchange phase, whereas if we just use an OPRF (as we do today), the client only needs their password to complete the key exchange phase. I see that in your application, you might be able to pre-share the VOPRF verification key in the client binary...

Edit: And as @bytemare pointed out in an offline conversation with me, note that these VOPRF verification keys are user-specific (different for each user, not quite like a single global server public key). Would it still make sense in your application to pre-share these per-client VOPRF verification keys that get established only after each client's own registration? Presumably this precludes it from being embedded in the client binary...

@bytemare I think we should consider altering the context binding for the POPRF to use a fixed string, or at least not include the client_identity, since the convention we have right now is that if no client_identity is supplied, then the client's public key is used. But this needs to be established before the server registration's evaluate function, whereas the client chooses their public key one step later.

chris-wood commented 2 years ago

Good points by @kevinlewi and @bytemare -- I forgot that the OPRF key is already domain separated by the (unique per user) credential identifier, so we shouldn't need to additionally include client_identifier in the POPRF metadata.

This brings us back to the drawing board. What, if anything, should go in the POPRF info string? We already have a slot for arbitrary application data (the AKE context string), so we shouldn't have another slot for arbitrary application data. (That seems super confusing to me.)

bytemare commented 2 years ago

Re-reading @nategraf's comment, I think the suggestion is to ship the VOPRF public key with KE2 (e.g. under the masked response). That would mean that the client derives a bunch of things in order to recover the plaintext, to then do the VOPRF verification.

Is my understanding of your suggestion correct?

bytemare commented 2 years ago

I suggest we keep this issue open to track discussions around context and domain separations strings in POPRF and AKE.

Let's discuss the actual value of the POPRF info in #283.

nategraf commented 2 years ago

> Re-reading @nategraf's comment, I think the suggestion is to ship the VOPRF public key with KE2 (e.g. under the masked response). That would mean that the client derives a bunch of things in order to recover the plaintext, to then do the VOPRF verification.

Right, that is what I was thinking of: shipping the OPRF public key to the user with KE2 and including it in the data over which the auth tag is calculated to ensure it remains consistent after registration. Although I can see an issue with this now, in that the auth tag is only available after unmasking the record in the response. Client behavior would only be different (e.g. surfacing a "verification error" instead of "incorrect password") in a few narrow cases:

  1. If the OPRF response is manipulated such that it no longer verifies against the public key added to KE2, the client will detect this instead of simply failing to derive the masking key. But an attacker would also be able to modify the included public key, so it's unclear why they would bother doing this.
  2. If the OPRF public key is malleable, then an attacker may include a public key that would be consistent with the evaluation, allowing the client to unmask the response and check the auth tag, at which point they would see a failure. Again, it's unclear what value this is to an attacker.

In particular, if the attacker replaces the evaluation and public key in KE2 with a distinct public key and associated evaluation, the client would simply fail to derive the masking key and be unable to detect that the public key is inconsistent. As a result, this is not so valuable.

Anyway, this is off topic for this issue. 😅

chris-wood commented 2 years ago

Overcome by #324.