Closed bytemare closed 2 years ago
If I understand correctly, the public input is only needed by the server during POPRF evaluation and the client during verification. Because, as specified, the client does not have access to the (P)OPRF verification key, the client can't verify the output and so doesn't actually need the public input. Is that a fair claim? (I don't fully understand how, from a client perspective, using a POPRF without verification is different than using an OPRF with domain separated keys, as the current draft does, so I imagine I am missing something)
Hi @nategraf, yes indeed: we don't use a verifiable OPRF, since we use the envelope MAC to verify that the output is correct.
POPRF, compared to the earlier OPRF, allows public metadata to be used for domain separation, which is a desirable feature when serving multiple different clients. Until now, we did domain separation by mixing the oprf_seed with the credential_identifier to derive the client-specific OPRF evaluation key ku. If this does not come across clearly in the document, I'd be very interested in your feedback :)
Hi @hugokraw,
with @kevinlewi, we are wondering whether the current domain separation (i.e. ku = hash-to-scalar(HKDF-Expand(oprf_seed, credential_identifier))) has different security properties than the domain separation offered by the new POPRF (i.e. using the info field as the domain separation input), and whether there would be any benefit in switching to the latter. Do you have an opinion on that?
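For readers following along, here is a minimal sketch of the per-credential key derivation described above. It is not the draft's exact procedure: the label strings the draft mixes in are omitted, hash-to-scalar is replaced by a plain wide modular reduction, and the choice of HMAC-SHA512 and the ristretto255 group order are assumptions made only for illustration. The function names are hypothetical.

```python
import hashlib
import hmac

# Order of the ristretto255 / Ed25519 prime-order group (assumed group choice).
GROUP_ORDER = 2**252 + 27742317777372353535851937790883648493


def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha512") -> bytes:
    """HKDF-Expand as defined in RFC 5869, built on the stdlib hmac module."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_oprf_key(oprf_seed: bytes, credential_identifier: bytes) -> int:
    """Sketch of ku = hash-to-scalar(HKDF-Expand(oprf_seed, credential_identifier)).

    Assumption: hash-to-scalar is approximated by reducing 64 bytes of
    output modulo the group order, which is NOT the real hash_to_field.
    """
    okm = hkdf_expand(oprf_seed, credential_identifier, 64)
    return int.from_bytes(okm, "little") % GROUP_ORDER
```

The point of the sketch is that ku is a deterministic function of oprf_seed and credential_identifier only, so the server never has to store a per-user key.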
It looks like we agree to set the POPRF's info field to an app-independent fixed string.
@bytemare my feedback is just that it is unclear why this specification, with the introduction of a POPRF, prefers to implement domain separation by deriving a domain-specific key rather than using a fixed key and implementing domain separation using the info field of the POPRF.
Concretely, this disallows an implementation of OPAQUE from using an MPC POPRF construction built on algebraically related keys (e.g. constructions based on threshold BLS signatures).
(It also disallows any extension to support verification via pre-shared public keys, although you've made it clear that this proposal has decided this is a non-goal)
@nategraf The introduction of the deterministic derivation of the key from the oprf_seed parameter was not primarily for domain separation, but to address the "client enumeration attacks", where information is leaked during user re-registration. See #210, #215, for some older discussion on how we arrived at the current state, as well as the section here (https://github.com/cfrg/draft-irtf-cfrg-opaque/blob/master/draft-irtf-cfrg-opaque.md#client-enumeration-preventing-client-enumeration).
Moving the credential_identifier parameter into the info field of the POPRF would achieve domain separation, but I don't believe it would be sufficient to address the client enumeration attacks, since that seems to require deterministic key derivation from oprf_seed.
Unfortunately, as you pointed out, this tradeoff does preclude constructing the individual user OPRF keys in an algebraic manner.
@kevinlewi reading into those issues and digging around a bit I found your PR (https://github.com/cfrg/draft-irtf-cfrg-opaque/pull/156) where you changed the protocol to use the current deterministic key generation method in replacement of generating the keys randomly at registration or password change and persisting the generated key. Reading the previous draft, my understanding is the client enumeration risk was associated with A) changing from the default OPRF key to a fresh randomly generated one upon registration and B) changing the OPRF key when the client changes their password.
In general, the output of the OPRF depended on the blinded input (request.data) and a randomly generated OPRF key (oprf_key), which could change depending on the actions of the honest client (e.g. on registration). With the introduction of the deterministic method the output depends on the blinded input (request.data), client_identifier, and oprf_seed, none of which depend on the actions of the honest client. Is my understanding correct?
If my understanding is correct, I think setting the info field to client_identifier and using a fixed POPRF key (oprf_seed) would have the same properties. In particular, the output would depend upon the blinded input (request.data), client_identifier and oprf_seed. Do you think this is accurate?
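To make the comparison concrete, here is a toy sketch of the fixed-key alternative: a 3HashSDHI-style evaluation where the identifier goes into info and the server raises the blinded element to 1/(k + H(info)). It runs in a tiny, insecure multiplicative group (p = 2039, q = 1019) chosen only so the example is self-contained; the function names, the group, and the final hash framing are assumptions for illustration, not the VOPRF draft's API.

```python
import hashlib
import secrets

# Toy parameters: safe prime P = 2Q + 1. Far too small for real use.
P, Q = 2039, 1019


def hash_to_group(data: bytes) -> int:
    """Map bytes to a non-identity element of the order-Q subgroup of squares."""
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 3) + 2
    return pow(h, 2, P)


def info_tag(info: bytes) -> int:
    """Scalar derived from the public info string (length-framed)."""
    framed = b"Info" + len(info).to_bytes(2, "big") + info
    return int.from_bytes(hashlib.sha256(framed).digest(), "big") % Q


def blind(password: bytes):
    """Client: hide the input behind a random exponent r."""
    r = secrets.randbelow(Q - 1) + 1
    return r, pow(hash_to_group(password), r, P)


def poprf_evaluate(k: int, blinded: int, info: bytes) -> int:
    """Server: fixed key k, domain separation through info."""
    t = (k + info_tag(info)) % Q
    if t == 0:
        raise ValueError("k + H(info) is not invertible")
    return pow(blinded, pow(t, -1, Q), P)


def finalize(password: bytes, info: bytes, r: int, evaluated: int) -> str:
    """Client: remove the blind and hash the transcript into the output."""
    unblinded = pow(evaluated, pow(r, -1, Q), P)
    return hashlib.sha256(password + info + unblinded.to_bytes(2, "big")).hexdigest()
```

In this sketch the final output is a function of the password, info, and k alone: the blind r cancels out, and nothing depends on what the honest client did earlier. Whether that is enough for the client enumeration concern is exactly the question being asked here.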
Unfortunately, as you pointed out, this tradeoff does preclude constructing the individual user OPRF keys in an algebraic manner.
Sorry, I wasn't very clear before. I was actually referring to threshold (MPC) based solutions such as the OPRFs derived from BLS threshold signatures which use a Shamir-like DKG process to establish their keys. (Pythia's POPRF construction also may be amenable to thresholdization using the same DKG process, and that is what our team is working on now). Essentially, because this generates the (P)OPRF keys dynamically, it prevents the use of such protocols, since they require a (D)KG process to be run ahead-of-time. IIUC, generating the keys using a KDF applied to the client_identifier and oprf_seed as is done now also disables the use of the construction from "Threshold Partially-Oblivious PRFs with Applications to Key Management" which also requires an ahead-of-time key setup process. Obviously @hugokraw would be the authority on that though.
Thanks for taking the time to answer all my comments so far. I've definitely found it interesting and hopefully some of these comments turn out to be useful.
In general, the output of the OPRF depended on the blinded input (request.data) and a randomly generated OPRF key (oprf_key), which could change depending on the actions of the honest client (e.g. on registration). With the introduction of the deterministic method the output depends on the blinded input (request.data), client_identifier, and oprf_seed, none of which depend on the actions of the honest client. Is my understanding correct?
That's correct.
If my understanding is correct, I think setting the info field to client_identifier and using a fixed POPRF key (oprf_seed) would have the same properties. In particular, the output would depend upon the blinded input (request.data), client_identifier and oprf_seed. Do you think this is accurate?
Good point! After some more thought on this, I believe you are correct. In some senses it might be "cleaner" to incorporate credential_identifier in the way that you are suggesting. (Btw it is credential_identifier and not client_identity that is set this way, I am assuming that that's what you meant by client_identifier). I think we should double-check with @hugokraw on the security of this, but I would be in favor of making this change if it retains the same security guarantees. cc: @bytemare, @chris-wood
Sorry, I wasn't very clear before. I was actually referring to threshold (MPC) based solutions such as the OPRFs derived from BLS threshold signatures which use a Shamir-like DKG process to establish their keys. (Pythia's POPRF construction also may be amenable to thresholdization using the same DKG process, and that is what our team is working on now). Essentially, because this generates the (P)OPRF keys dynamically, it prevents the use of such protocols, since they require a (D)KG process to be run ahead-of-time. IIUC, generating the keys using a KDF applied to the client_identifier and oprf_seed as is done now also disables the use of the construction from "Threshold Partially-Oblivious PRFs with Applications to Key Management" which also requires an ahead-of-time key setup process. Obviously @hugokraw would be the authority on that though.
I see, thanks for the clarification.
I am not happy with the move to POPRF (or should I say that I am against it?). POPRF does not have a proof that it satisfies the UC OPRF functionality which is the basis for the proof of OPAQUE in the UC. I am not saying POPRF cannot be proven secure in that sense (I did not try) but these things are never trivial. So at this point the claim of provability of OPAQUE would be unresolved. Even if someone would adjust the proof to work with the current security definition of POPRF, it would not be in the UC which was one of the "selling points" in the CFRG process. (By the way, does the proof in the POPRF paper include the non-verifiable version?). In addition, the assumptions for this POPRF are stronger and less standard than for 2HashDH. In general, the latter is also conceptually simpler which is also an advantage (though not a decisive point). POPRF is also problematic regarding key rotation (as pointed out by the POPRF paper itself), something that applies to the use in OPAQUE. Finally, regarding threshold implementation, 2HashDH has a trivial implementation, non-interactive and proactivizable. The latter element is lost when deriving individual user keys via a PRF but this is not a must in general and even in this case one gets very efficient threshold schemes when the number of servers is not very large (as in most practical cases). I hope we can revert the decision. I see that a new VOPRF draft posted today only defines this POPRF. I am surprised at that decision, particularly that no such discussion took place in the WG. This would seem to force OPAQUE to use POPRF which, as said, I do not recommend.
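As an illustration of the threshold point above, here is a toy sketch of one common way to thresholdize a 2HashDH-style evaluation: the key k is Shamir-shared, each server independently raises the blinded element to its share, and the client combines the partial results with Lagrange coefficients in the exponent. The tiny group (p = 2039, q = 1019), the function names, and the sharing parameters are assumptions for illustration only and are not taken from any of the drafts.

```python
import hashlib
import secrets

# Toy parameters: safe prime P = 2Q + 1. Far too small for real use.
P, Q = 2039, 1019


def hash_to_group(data: bytes) -> int:
    """Map bytes to a non-identity element of the order-Q subgroup of squares."""
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 3) + 2
    return pow(h, 2, P)


def shamir_share(secret: int, threshold: int, n: int) -> dict:
    """Split secret into n shares; any `threshold` of them recover it."""
    coeffs = [secret] + [secrets.randbelow(Q) for _ in range(threshold - 1)]
    return {
        i: sum(c * pow(i, e, Q) for e, c in enumerate(coeffs)) % Q
        for i in range(1, n + 1)
    }


def partial_evaluate(share: int, blinded: int) -> int:
    """One server's non-interactive contribution: blinded^share."""
    return pow(blinded, share, P)


def lagrange_at_zero(i: int, indices: list) -> int:
    """Lagrange coefficient for index i when interpolating at x = 0, mod Q."""
    num, den = 1, 1
    for j in indices:
        if j != i:
            num = num * j % Q
            den = den * (j - i) % Q
    return num * pow(den, -1, Q) % Q


def combine(partials: dict) -> int:
    """Client: interpolate in the exponent to obtain blinded^k."""
    indices = list(partials)
    result = 1
    for i, z_i in partials.items():
        result = result * pow(z_i, lagrange_at_zero(i, indices), P) % P
    return result
```

Each server answers with a single exponentiation and never talks to the other servers, which is the "non-interactive" property mentioned above. The combined value equals what a single server holding k would have returned, so the client's unblinding step is unchanged.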
I am not happy with the move to POPRF (or should I say that I am against it?).
I think the latter =) This is a fine position to take, though I don't think it's a concern for the OPAQUE specification specifically because the protocol can accommodate any OPRF. I'll reply to specific points below.
POPRF does not have a proof that it satisfies the UC OPRF functionality which is the basis for the proof of OPAQUE in the UC. I am not saying POPRF cannot be proven secure in that sense (I did not try) but these things are never trivial. So at this point the claim of provability of OPAQUE would be unresolved. Even if someone would adjust the proof to work with the current security definition of POPRF, it would not be in the UC which was one of the "selling points" in the CFRG process.
It's true that there's no proof that the POPRF satisfies the UC OPRF functionality, but as you say, this is something that can be done. I think we (CFRG) should do that work. We're doing similar analyses for other CFRG drafts, so this is not unprecedented.
(By the way, does the proof in the POPRF paper include the non-verifiable version?)
Indeed!
In addition, the assumptions for this POPRF are stronger and less standard than for 2HashDH. In general, the latter is also conceptually simpler which is also an advantage (though not a decisive point).
This is true, though the authors have confidence in the assumptions and the reduction to ECDL. Though it's a new assumption, so our mileage may vary.
POPRF is also problematic regarding key rotation (as pointed out by the POPRF paper itself), something that applies to the use in OPAQUE.
I think this may be a misunderstanding. Key management is identical in the OPRF and POPRF protocols.
Finally, regarding threshold implementation, 2HashDH has a trivial implementation, non-interactive and proactivizable. The latter element is lost when deriving individual user keys via a PRF but this is not a must in general and even in this case one gets very efficient threshold schemes when the number of servers is not very large (as in most practical cases).
This seems to be the most notable regression. However, I don't consider it to be a problem, for two reasons:
1) Threshold variants of the OPRF are out of scope for draft-irtf-cfrg-voprf, so although we don't have an easy way to threshold the POPRF, I don't think we've lost any functionality.
2) OPAQUE can be configured to use any OPRF, including ones that are amenable to threshold implementations.
It's certainly possible that we could add back the original 2HashDH to draft-irtf-cfrg-voprf, but that's a question for that document and less so for OPAQUE.
I see that a new VOPRF draft posted today only defines this POPRF. I am surprised at that decision, particularly that no such discussion took place in the WG. This would seem to force OPAQUE to use POPRF which, as said, I do not recommend.
This proposal was presented without objection during the last IETF meeting.
The bottom line is this: If POPRF does not have an advantage for the OPAQUE protocol (does it?), then we should keep 2HashDH as the default one. It is simpler and threshold friendly (and while POPRF has some advantage regarding verifiability, this is not needed in OPRF). As for the general VOPRF draft, I strongly recommend it includes a full specification of 2HashDH. Not only for OPAQUE but for many other applications for which 2HashDH is a better option (simplicity, threshold, assumptions). I also recommend that the VOPRF document includes POPRF for applications that require support for metadata values and cannot afford a per-value public verification key. I hope you can agree with this...
Btw, even if threshold is out of scope for the voprf draft, it is a basis for more advanced constructions, and threshold is an important extension so supporting 2HashDH for that is important.
The one thing I am fully opposed to is to outsource the definition of the OPRF in OPAQUE to the VOPRF document and then only have POPRF in that document.
The bottom line is this: If POPRF does not have an advantage for the OPAQUE protocol (does it?), then we should keep 2HashDH as the default one.
I'm not sure I agree with this, because...
The one thing I am fully opposed to is to outsource the definition of the OPRF in OPAQUE to the VOPRF document and then only have POPRF in that document.
OPAQUE depends on the VOPRF specification. We shouldn't be defining an OPRF just for the purposes of OPAQUE. The purpose of draft-irtf-cfrg-voprf is to be maximally useful for all use cases, of which OPAQUE is one. So that means maybe adding the 2HashDH OPRF back to that document, all for the purposes of aiding threshold deployments. Right now, that doesn't seem like sufficient justification to have two constructions with fundamentally different properties under the same abstraction. We could add back 2HashDH under a different abstraction in draft-irtf-cfrg-voprf. That would just widen the scope of that document, though, which isn't harmful.
All that said, I think the simplest and most pragmatic thing to do is just keep the POPRF. But I think a reasonable compromise is to support both OPRF and POPRF in draft-irtf-cfrg-voprf, under separate abstractions. We can discuss this during IETF 112.
Btw, even if threshold is out of scope for the voprf draft, it is a basis for more advanced constructions, and threshold is an important extension so supporting 2HashDH for that is important.
No disagreement there! I'm simply noting that threshold implementations were not in scope for that specification. We can always change that though. =)
I'd say that if you end up driving people to implement OPAQUE with POPRF (*) you would have made a disservice to these implementers. It has no benefit over 2HashDH for the practice of OPAQUE and a serious barrier for threshold implementations ("not in scope" now, but I hope we will see more of them in the future).
(*) This is exactly what you would be doing if you defined the VOPRF with POPRF only, and define OPAQUE only with hooks to VOPRF.
Well, we've written the OPAQUE spec such that any OPRF can be used, so there is no restriction to POPRF. For example, if someone wants to use a future PQ OPRF, they can do so with no substantial OPAQUE changes.
I agree that in the future people can choose to use other OPRFs, but right now, if you define the OPRF with VOPRF-draft hooks they will use whatever is defined there. And if what is defined there is only POPRF, that's what they will use. I do not recommend it.
I agree that in the future people can choose to use other OPRFs, but right now, if you define the OPRF with VOPRF-draft hooks they will use whatever is defined there.
The OPRF dependency isn't defined this way -- it's meant to accommodate any suitable OPRF, specifically so one can choose to use an OPRF that suits their needs.
Where are implementers supposed to get the specification of 2HashDH if they wanted to use it given its benefits if it is not part of the VOPRF spec?
Where are implementers supposed to get the specification of 2HashDH if they wanted to use it given its benefits if it is not part of the VOPRF spec?
Probably in the same spec which describes how to thresholdize 2HashDH (which doesn't exist). We could write one, though, or add it to the existing doc.
I vote for the latter: Add it to the existing doc, namely, the voprf draft. Writing a new one would postpone it unnecessarily and, frankly, in my opinion, it makes no sense to only have POPRF in a basic OPRF document. Btw, threshold is a dimension where 2HashDH is better than POPRF but I do not see why one would use POPRF in any setting where 2HashDH suffices.
Btw, one can interpret my insistence as pushing "my own stuff". I hope you understand this is a sincere opinion based on technical stuff, particularly as this is NOT my own stuff. It was invented by Chaum 30 years ago. Also, I think the POPRF work is excellent and I am happy to have a partial OPRF which is non-pairing based, and happy that the voprf draft will define it for those that need a partial OPRF. I just don't think that POPRF needs to be the one size that fits all, in particular, not the best fit with OPRF.
A POPRF with a fixed info string is functionally an OPRF (ignoring threshold implementations), so it is redundant to add both to the same doc. The question I think we're asking here is whether we need to specify a standard version of a threshold OPRF. (As of now, I don't think the benefits of a threshold variant warrant inclusion in the existing doc.)
Functionally speaking, at least in its basic form, POPRF is functionally equivalent to 2HashDH, but formally speaking, we do not have a proof of OPAQUE under the security definition behind POPRF as this proof relies on a UC object - which is also how the very notion of aPAKE is formalized in the analysis of OPAQUE in JKX18. If draft-voprf will have the hooks needed to implement 2HashDH then I see absolutely no reason not to include that instantiation in the document (while I see significant reason to include it).
but formally speaking, we do not have a proof of OPAQUE under the security definition behind POPRF as this proof relies on a UC object
Right, and this is a gap I think we can overcome, as we're doing similar analyses for other CFRG documents.
If draft-voprf will have the hooks needed to implement 2HashDH then I see absolutely no reason not to include that instantiation in the document (while I see significant reason to include it).
I think it complicates the implementation story to have both in the same document, under the same API (or syntax). 2HashDH doesn't support metadata, so would one use the same syntax to describe 2HashDH as one would to describe 3HashSDHI?
It would seem reasonable to me to instead keep these syntaxes separate, and then have instantiations for both in draft-irtf-cfrg-voprf. One syntax for POPRFs implemented based on 3HashSDHI, and another for OPRFs implemented with 2HashDH.
Overcome by #324.
Currently, domain separation on OPRF evaluation is done using the client's record identifier and a global seed to derive a user-specific evaluation key.
(V)OPRF introduces POPRF with metadata that integrates domain separation in the API. This API allows for an info value for the server's evaluation function and an info value for the client's finalization function. These two values can be different from one another and the protocol will still execute correctly, but these values must be constant across sessions to yield the same result. This last condition requires the client to possess client-specific discriminating public information that would allow proper domain separation. The client_identity is currently not required to exist for the client, and can therefore be empty, which makes it an unreliable candidate for the protocol.
What should these values be?