cfrg / draft-irtf-cfrg-opaque

The OPAQUE Asymmetric PAKE Protocol
https://cfrg.github.io/draft-irtf-cfrg-opaque/draft-irtf-cfrg-opaque.html

POPRF info value #283

Closed bytemare closed 2 years ago

bytemare commented 2 years ago

Currently, domain separation on OPRF evaluation is done using the client's record identifier and a global seed to derive a user-specific evaluation key.

(V)OPRF introduces POPRF with metadata that integrates domain separation in the API. This API allows for an info value for the server's evaluation function and an info value for the client's finalization function. These two values can be different from one another and the protocol will still execute correctly, but these values must be constant across sessions to yield the same result. This last condition requires the client to possess client-specific discriminating public information that would allow proper domain separation. The client_identity is currently not required to exist for the client and can therefore be empty, which makes it an unreliable candidate for this purpose.
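To make the role of the two info values concrete, here is a minimal sketch of a 3HashSDHI-style POPRF over a toy group. It is not the (V)OPRF draft's API: the group parameters `P` and `Q`, the hash tags, and all function names are assumptions chosen for illustration, and the group is far too small to be secure. It only shows that the server tweaks its key with its info value, that the client mixes its own info value into the final hash, and that the output is stable across sessions as long as both values stay constant.

```python
import hashlib
import secrets

# Toy group (assumption): the order-Q subgroup of squares modulo the
# safe prime P = 2*Q + 1. Insecure, used only to keep the sketch readable.
P, Q = 1019, 509


def _hash(tag: bytes, *parts: bytes) -> bytes:
    """SHA-256 over a tag and length-prefixed parts (illustrative encoding)."""
    h = hashlib.sha256(tag)
    for part in parts:
        h.update(len(part).to_bytes(2, "big") + part)
    return h.digest()


def hash_to_group(x: bytes) -> int:
    # Squaring a nonzero residue lands in the order-Q subgroup.
    u = int.from_bytes(_hash(b"group", x), "big") % (P - 1) + 1
    return pow(u, 2, P)


def hash_to_scalar(info: bytes) -> int:
    return int.from_bytes(_hash(b"scalar", info), "big") % Q


def blind(x: bytes):
    """Client: hide the input behind a random exponent r in [1, Q-1]."""
    r = secrets.randbelow(Q - 1) + 1
    return r, pow(hash_to_group(x), r, P)


def evaluate(k: int, blinded: int, server_info: bytes) -> int:
    """Server: tweak the key with the public info, then invert the tweak."""
    t = (k + hash_to_scalar(server_info)) % Q
    if t == 0:
        raise ValueError("tweaked key is zero for this info value")
    return pow(blinded, pow(t, -1, Q), P)


def finalize(x: bytes, r: int, evaluated: int, client_info: bytes) -> bytes:
    """Client: remove the blind and hash input, info, and group element."""
    unblinded = pow(evaluated, pow(r, -1, Q), P)
    return _hash(b"finalize", x, client_info, unblinded.to_bytes(2, "big"))


def run_session(k: int, x: bytes, server_info: bytes, client_info: bytes) -> bytes:
    r, blinded = blind(x)
    return finalize(x, r, evaluate(k, blinded, server_info), client_info)
```

Running `run_session` twice with the same key, input, and info values gives the same output even though each session uses a fresh blind; changing either info value changes the output, which is why the client needs a stable, client-specific public value.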

What should these values be?

nategraf commented 2 years ago

If I understand correctly, the public input is only needed by the server during POPRF evaluation and the client during verification. Because, as specified, the client does not have access to the (P)OPRF verification key, the client can't verify the output and so doesn't actually need the public input. Is that a fair claim? (I don't fully understand how, from a client perspective, using a POPRF without verification is different than using an OPRF with domain separated keys, as the current draft does, so I imagine I am missing something)

bytemare commented 2 years ago

Hi @nategraf, yes indeed, we don't use a verifiable OPRF, as we use the envelope MAC to verify that the output is correct.

POPRF, compared to the earlier OPRF, allows public metadata to be used for domain separation, which is a desirable feature when serving multiple different clients. Until now, we did domain separation by mixing the oprf_seed with the credential_identifier to derive the client-specific OPRF evaluation key ku. If this does not come across clearly in the document, I'd be very interested in your feedback :)

bytemare commented 2 years ago

Hi @hugokraw, with @kevinlewi, we are wondering whether the current domain separation (i.e. ku = hash-to-scalar(HKDF-Expand(oprf_seed, credential_identifier))) has different security properties than the domain separation offered by the new POPRF (i.e. use the info field for domain separation input), and if there would be any benefit in switching to the latter. Do you have an opinion on that?
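For readers who want to see the shape of the current derivation, the following is a rough sketch of ku = hash-to-scalar(HKDF-Expand(oprf_seed, credential_identifier)). HKDF-Expand follows RFC 5869 with HMAC-SHA-256. The hash-to-scalar step here is a simplification (expand 48 bytes and reduce modulo the P-256 group order) and is an assumption of this example; the draft uses its own labels and key-derivation routine, which this sketch does not reproduce.

```python
import hashlib
import hmac

# Order of the NIST P-256 group, used here only as an example scalar field.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand from RFC 5869, instantiated with HMAC-SHA-256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_oprf_key(oprf_seed: bytes, credential_identifier: bytes) -> int:
    """Simplified stand-in for hash-to-scalar(HKDF-Expand(seed, id)).

    Expanding 48 bytes before reducing keeps the modular bias negligible.
    The result is mapped into [1, P256_ORDER - 1] so it is never zero.
    """
    wide = hkdf_expand(oprf_seed, credential_identifier, 48)
    return int.from_bytes(wide, "big") % (P256_ORDER - 1) + 1
```

The key is a pure function of the server-side seed and the credential identifier, so the server does not need to store a per-client OPRF key and different identifiers get independent-looking keys.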

bytemare commented 2 years ago

It looks like we agree to set the POPRF's info field to an app-independent fixed string.

nategraf commented 2 years ago

@bytemare my feedback is just that it is unclear why this specification, with the introduction of a POPRF, prefers to implement domain separation by deriving a domain-specific key rather than using a fixed key and implementing domain separation using the info field of the POPRF.

Concretely, this disallows an implementation of OPAQUE from using an MPC POPRF construction built on algebraically related keys (e.g. constructions based on threshold BLS signatures).

(It also disallows any extension to support verification via pre-shared public keys, although you've made it clear that this proposal has decided this is a non-goal)

kevinlewi commented 2 years ago

@nategraf The introduction of the deterministic derivation of the key from the oprf_seed parameter was not primarily for domain separation, but to address the "client enumeration attacks", where information is leaked during user re-registration. See #210, #215, for some older discussion on how we arrived at the current state, as well as the section here (https://github.com/cfrg/draft-irtf-cfrg-opaque/blob/master/draft-irtf-cfrg-opaque.md#client-enumeration-preventing-client-enumeration).

Moving the credential_identifier parameter into the info field of the POPRF would achieve domain separation, but I don't believe it would be sufficient to address the client enumeration attacks, since that seems to require deterministic key derivation from oprf_seed.

Unfortunately, as you pointed out, this tradeoff does preclude constructing the individual user OPRF keys in an algebraic manner.

nategraf commented 2 years ago

@kevinlewi reading into those issues and digging around a bit, I found your PR (https://github.com/cfrg/draft-irtf-cfrg-opaque/pull/156) where you changed the protocol to use the current deterministic key generation method instead of generating the keys randomly at registration or password change and persisting the generated key. Reading the previous draft, my understanding is that the client enumeration risk was associated with A) changing from the default OPRF key to a fresh randomly generated one upon registration and B) changing the OPRF key when the client changes their password.

In general, the output of the OPRF depended on the blinded input (request.data) and a randomly generated OPRF key (oprf_key), which could change depending on the actions of the honest client (e.g. on registration). With the introduction of the deterministic method the output depends on the blinded input (request.data), client_identifier, and oprf_seed, none of which depend on the actions of the honest client. Is my understanding correct?

If my understanding is correct, I think setting the info field to client_identifier and using a fixed POPRF key (oprf_seed) would have the same properties. In particular, the output would depend upon the blinded input (request.data), client_identifier and oprf_seed. Do you think this is accurate?

Unfortunately, as you pointed out, this tradeoff does preclude constructing the individual user OPRF keys in an algebraic manner.

Sorry, I wasn't very clear before. I was actually referring to threshold (MPC) based solutions such as the OPRFs derived from BLS threshold signatures, which use a Shamir-like DKG process to establish their keys. (Pythia's POPRF construction may also be amenable to thresholdization using the same DKG process, and that is what our team is working on now.) Essentially, because this generates the (P)OPRF keys dynamically, it prevents the use of such protocols, since they require a (D)KG process to be run ahead of time. IIUC, generating the keys using a KDF applied to the client_identifier and oprf_seed, as is done now, also disables the use of the construction from "Threshold Partially-Oblivious PRFs with Applications to Key Management", which also requires an ahead-of-time key setup process. Obviously @hugokraw would be the authority on that though.

Thanks for taking the time to answer all my comments so far. I've definitely found it interesting and hopefully some of these comments turn out to be useful.

kevinlewi commented 2 years ago

In general, the output of the OPRF depended on the blinded input (request.data) and a randomly generated OPRF key (oprf_key), which could change depending on the actions of the honest client (e.g. on registration). With the introduction of the deterministic method the output depends on the blinded input (request.data), client_identifier, and oprf_seed, none of which depend on the actions of the honest client. Is my understanding correct?

That's correct.
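The property being confirmed can be sketched with two toy servers. This is an illustration under stated assumptions, not the draft's construction: the class names, the toy group `P`/`Q`, and the use of a single HMAC call as the key-derivation step are all hypothetical. `RandomKeyServer` models the older behaviour (a default key for unknown users, replaced by a fresh random key at registration), while `SeededServer` derives the key from a fixed seed and the credential identifier.

```python
import hashlib
import hmac
import secrets

# Toy group (assumption): order-Q subgroup of squares modulo P = 2*Q + 1.
P, Q = 1019, 509


class RandomKeyServer:
    """Older model: unknown users share a default key, registration picks a new one."""

    def __init__(self):
        self.default_key = secrets.randbelow(Q - 1) + 1
        self.keys = {}

    def register(self, credential_identifier: bytes) -> None:
        # The key now depends on an action taken by the honest client.
        self.keys[credential_identifier] = secrets.randbelow(Q - 1) + 1

    def evaluate(self, credential_identifier: bytes, blinded: int) -> int:
        key = self.keys.get(credential_identifier, self.default_key)
        return pow(blinded, key, P)


class SeededServer:
    """Deterministic model: the key is a function of oprf_seed and the identifier."""

    def __init__(self):
        self.oprf_seed = secrets.token_bytes(32)

    def register(self, credential_identifier: bytes) -> None:
        # Registration does not change anything the OPRF evaluation depends on.
        pass

    def evaluate(self, credential_identifier: bytes, blinded: int) -> int:
        digest = hmac.new(self.oprf_seed, credential_identifier, hashlib.sha256).digest()
        key = int.from_bytes(digest, "big") % (Q - 1) + 1
        return pow(blinded, key, P)
```

An observer who replays the same blinded element (for example `4`, a square modulo `P`) before and after `register` sees an identical answer from `SeededServer`. With `RandomKeyServer` the answer changes unless the fresh key happens to equal the default one, which is what lets registration events be detected.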

If my understanding is correct, I think setting the info field to client_identifier and using a fixed POPRF key (oprf_seed) would have the same properties. In particular, the output would depend upon the blinded input (request.data), client_identifier and oprf_seed. Do you think this is accurate?

Good point! After some more thought on this, I believe you are correct. In some senses it might be "cleaner" to incorporate credential_identifier in the way that you are suggesting. (Btw it is credential_identifier and not client_identity that is set this way, I am assuming that that's what you meant by client_identifier). I think we should double-check with @hugokraw on the security of this, but I would be in favor of making this change if it retains the same security guarantees. cc: @bytemare , @chris-wood

Sorry, I wasn't very clear before. I was actually referring to threshold (MPC) based solutions such as the OPRFs derived from BLS threshold signatures, which use a Shamir-like DKG process to establish their keys. (Pythia's POPRF construction may also be amenable to thresholdization using the same DKG process, and that is what our team is working on now.) Essentially, because this generates the (P)OPRF keys dynamically, it prevents the use of such protocols, since they require a (D)KG process to be run ahead of time. IIUC, generating the keys using a KDF applied to the client_identifier and oprf_seed, as is done now, also disables the use of the construction from "Threshold Partially-Oblivious PRFs with Applications to Key Management", which also requires an ahead-of-time key setup process. Obviously @hugokraw would be the authority on that though.

I see, thanks for the clarification.

hugokraw commented 2 years ago

I am not happy with the move to POPRF (or should I say that I am against it?). POPRF does not have a proof that it satisfies the UC OPRF functionality which is the basis for the proof of OPAQUE in the UC. I am not saying POPRF cannot be proven secure in that sense (I did not try) but these things are never trivial. So at this point the claim of provability of OPAQUE would be unresolved. Even if someone would adjust the proof to work with the current security definition of POPRF, it would not be in the UC which was one of the "selling points" in the CFRG process. (By the way, does the proof in the POPRF paper include the non-verifiable version?). In addition, the assumptions for this POPRF are stronger and less standard than for 2HashDH. In general, the latter is also conceptually simpler which is also an advantage (though not a decisive point). POPRF is also problematic regarding key rotation (as pointed out by the POPRF paper itself), something that applies to the use in OPAQUE. Finally, regarding threshold implementation, 2HashDH has a trivial implementation, non-interactive and proactivizable. The latter element is lost when deriving individual user keys via a PRF but this is not a must in general and even in this case one gets very efficient threshold schemes when the number of servers is not very large (as in most practical cases). I hope we can revert the decision. I see that a new VOPRF draft posted today only defines this POPRF. I am surprised at that decision, particularly that no such discussion took place in the WG. This would seem to force OPAQUE to use POPRF which, as said, I do not recommend.
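The threshold point about 2HashDH can be illustrated with a short sketch: the key is Shamir-shared, each server raises the blinded element to its own share without talking to the others, and the client combines any `threshold` responses using Lagrange coefficients in the exponent. The toy group `P`/`Q` and all function names are assumptions of this example, the parameters are insecure, and a trusted dealer stands in for a real DKG.

```python
import secrets

# Toy group (assumption): order-Q subgroup of squares modulo P = 2*Q + 1.
P, Q = 1019, 509


def share_key(k: int, threshold: int, n: int) -> dict:
    """Shamir-share k over Z_Q: share i is f(i) for a random polynomial with f(0) = k."""
    coeffs = [k] + [secrets.randbelow(Q) for _ in range(threshold - 1)]
    return {
        i: sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q
        for i in range(1, n + 1)
    }


def partial_evaluate(share: int, blinded: int) -> int:
    """Each server works alone: one exponentiation with its own share."""
    return pow(blinded, share, P)


def lagrange_at_zero(i: int, indices) -> int:
    """Lagrange coefficient for index i when interpolating at x = 0, modulo Q."""
    num = den = 1
    for j in indices:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q


def combine(partials: dict) -> int:
    """Client: interpolate in the exponent to recover blinded^k."""
    result = 1
    for i, part in partials.items():
        result = result * pow(part, lagrange_at_zero(i, partials), P) % P
    return result
```

For example, with `shares = share_key(77, 2, 3)`, combining the partial evaluations of any two servers on a blinded element gives the same value as a single server holding `k = 77`. The combination step needs no interaction between servers, which is the "trivial, non-interactive" property referred to above.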

chris-wood commented 2 years ago

I am not happy with the move to POPRF (or should I say that I am against it?).

I think the latter =) This is a fine position to take, though I don't think it's a concern for the OPAQUE specification specifically because the protocol can accommodate any OPRF. I'll reply to specific points below.

POPRF does not have a proof that it satisfies the UC OPRF functionality which is the basis for the proof of OPAQUE in the UC. I am not saying POPRF cannot be proven secure in that sense (I did not try) but these things are never trivial. So at this point the claim of provability of OPAQUE would be unresolved. Even if someone would adjust the proof to work with the current security definition of POPRF, it would not be in the UC which was one of the "selling points" in the CFRG process.

It's true that there's no proof that the POPRF satisfies the UC OPRF functionality, but as you say, this is something that can be done. I think we (CFRG) should do that work. We're doing similar analyses for other CFRG drafts, so this is not unprecedented.

(By the way, does the proof in the POPRF paper include the non-verifiable version?)

Indeed!

In addition, the assumptions for this POPRF are stronger and less standard than for 2HashDH. In general, the latter is also conceptually simpler which is also an advantage (though not a decisive point).

This is true, though the authors have confidence in the assumptions and the reduction to ECDL. Though it's a new assumption, so our mileage may vary.

POPRF is also problematic regarding key rotation (as pointed out by the POPRF paper itself), something that applies to the use in OPAQUE.

I think this may be a misunderstanding. Key management is identical in the OPRF and POPRF protocols.

Finally, regarding threshold implementation, 2HashDH has a trivial implementation, non-interactive and proactivizable. The latter element is lost when deriving individual user keys via a PRF but this is not a must in general and even in this case one gets very efficient threshold schemes when the number of servers is not very large (as in most practical cases).

This seems to be the most notable regression. However, I don't consider it to be a problem, for two reasons:

1) Threshold variants of the OPRF are out of scope for the draft-irtf-cfrg-voprf, so although we don't have an easy way to threshold the POPRF, I don't think we've lost any functionality.
2) OPAQUE can be configured to use any OPRF, including ones that are amenable to threshold implementations.

It's certainly possible that we could add back the original 2HashDH to draft-irtf-cfrg-voprf, but that's a question for that document and less so for OPAQUE.

I see that a new VOPRF draft posted today only defines this POPRF. I am surprised at that decision, particularly that no such discussion took place in the WG. This would seem to force OPAQUE to use POPRF which, as said, I do not recommend.

This proposal was presented without objection during the last IETF meeting.

hugokraw commented 2 years ago

The bottom line is this: If POPRF does not have an advantage for the OPAQUE protocol (does it?), then we should keep 2HashDH as the default one. It is simpler and threshold friendly (and while POPRF has some advantage regarding verifiability, this is not needed in OPRF). As for the general VOPRF draft, I strongly recommend it includes a full specification of 2HashDH. Not only for OPAQUE but for many other applications for which 2HashDH is a better option (simplicity, threshold, assumptions). I also recommend that the VOPRF document includes POPRF for applications that require support for metadata values and cannot afford a per-value public verification key. I hope you can agree with this...

Btw, even if threshold is out of scope for the voprf draft, it is a basis for more advanced constructions, and threshold is an important extension so supporting 2HashDH for that is important.

The one thing I am fully opposed to is to outsource the definition of the OPRF in OPAQUE to the VOPRF document and then only have POPRF in that document.

chris-wood commented 2 years ago

The bottom line is this: If POPRF does not have an advantage for the OPAQUE protocol (does it?), then we should keep 2HashDH as the default one.

I'm not sure I agree with this, because...

The one thing I am fully opposed to is to outsource the definition of the OPRF in OPAQUE to the VOPRF document and then only have POPRF in that document.

OPAQUE depends on the VOPRF specification. We shouldn't be defining an OPRF just for the purposes of OPAQUE. The purpose of draft-irtf-cfrg-voprf is to be maximally useful for all use cases, of which OPAQUE is one. So that means maybe adding the 2HashDH OPRF back to that document, all for the purposes of aiding threshold deployments. Right now, that doesn't seem like sufficient justification to have two constructions with fundamentally different properties under the same abstraction. We could add back 2HashDH under a different abstraction in draft-irtf-cfrg-voprf. That would just widen the scope of that document, though, which isn't harmful.

All that said, I think the simplest and most pragmatic thing to do is just keep the POPRF. But I think a reasonable compromise is to support both OPRF and POPRF in draft-irtf-cfrg-voprf, under separate abstractions. We can discuss this during IETF 112.

Btw, even if threshold is out of scope for the voprf draft, it is a basis for more advanced constructions, and threshold is an important extension so supporting 2HashDH for that is important.

No disagreement there! I'm simply noting that threshold implementations were not in scope for that specification. We can always change that though. =)

hugokraw commented 2 years ago

I'd say that if you end up driving people to implement OPAQUE with POPRF (*) you would have done a disservice to these implementers. It has no benefit over 2HashDH for the practice of OPAQUE and is a serious barrier for threshold implementations ("not in scope" now, but I hope we will see more of them in the future).

(*) This is exactly what you would be doing if you defined the VOPRF with POPRF only, and define OPAQUE only with hooks to VOPRF.

chris-wood commented 2 years ago

Well, we've written the OPAQUE spec such that any OPRF can be used, so there is no restriction to POPRF. For example, if someone wants to use a future PQ OPRF, they can do so with no substantial OPAQUE changes.

hugokraw commented 2 years ago

I agree that in the future people can choose to use other OPRFs, but right now, if you define the OPRF with VOPRF-draft hooks they will use whatever is defined there. And if what is defined there is only POPRF, that's what they will use. I do not recommend it.

chris-wood commented 2 years ago

I agree that in the future people can choose to use other OPRFs, but right now, if you define the OPRF with VOPRF-draft hooks they will use whatever is defined there.

The OPRF dependency isn't defined this way -- it's meant to accommodate any suitable OPRF, specifically so one can choose to use an OPRF that suits their needs.

hugokraw commented 2 years ago

Where are implementers supposed to get the specification of 2HashDH if they wanted to use it given its benefits if it is not part of the VOPRF spec?

chris-wood commented 2 years ago

Where are implementers supposed to get the specification of 2HashDH if they wanted to use it given its benefits if it is not part of the VOPRF spec?

Probably in the same spec which describes how to thresholdize 2HashDH (which doesn't exist). We could write one, though, or add it to the existing doc.

hugokraw commented 2 years ago

I vote for the latter: Add it to the existing doc, namely, the voprf draft. Writing a new one would postpone it unnecessarily and, frankly, in my opinion, it makes no sense to only have POPRF in a basic OPRF document. Btw, threshold is a dimension where 2HashDH is better than POPRF but I do not see why one would use POPRF in any setting where 2HashDH suffices.

Btw, one can interpret my insistence as pushing "my own stuff". I hope you understand this is a sincere opinion based on technical stuff, particularly as this is NOT my own stuff. It was invented by Chaum 30 years ago. Also, I think the POPRF work is excellent and I am happy to have a partial OPRF which is non-pairing based, and happy that the voprf draft will define it for those that need a partial OPRF. I just don't think that POPRF needs to be the one size that fits all, in particular, not the best fit with OPRF.

chris-wood commented 2 years ago

A POPRF with a fixed info string is functionally an OPRF (ignoring threshold implementations), so it is redundant to add both to the same doc. The question I think we're asking here is whether we need to specify a standard version of a threshold OPRF. (As of now, I don't think the benefits of a threshold variant warrant inclusion in the existing doc.)

hugokraw commented 2 years ago

At least in its basic form, POPRF is functionally equivalent to 2HashDH, but formally speaking, we do not have a proof of OPAQUE under the security definition behind POPRF as this proof relies on a UC object, which is also how the very notion of aPAKE is formalized in the analysis of OPAQUE in JKX18. If draft-voprf will have the hooks needed to implement 2HashDH then I see absolutely no reason not to include that instantiation in the document (while I see significant reason to include it).

chris-wood commented 2 years ago

but formally speaking, we do not have a proof of OPAQUE under the security definition behind POPRF as this proof relies on a UC object

Right, and this is a gap I think we can overcome, as we're doing similar analyses for other CFRG documents.

If draft-voprf will have the hooks needed to implement 2HashDH then I see absolutely no reason not to include that instantiation in the document (while I see significant reason to include it).

I think it complicates the implementation story to have both in the same document, under the same API (or syntax). 2HashDH doesn't support metadata, so would one use the same syntax to describe 2HashDH as one would to describe 3HashSDHI?

It would seem reasonable to me to instead keep these syntaxes separate, and then have instantiations for both in draft-irtf-cfrg-voprf. One syntax for POPRFs implemented based on 3HashSDHI, and another for OPRFs implemented with 2HashDH.

chris-wood commented 2 years ago

Overcome by #324.