BlockstreamResearch / bip-frost-dkg


Motivation for deterministic key generation unclear #40

Open LLFourn opened 3 months ago

LLFourn commented 3 months ago

I was implementing all hash arguments "just so" and took a moment to ask myself why it needed to be like this and I couldn't figure it out.

From the text this seems to be the current motivation:

ChillDKG constructs a transcript eq_input by appending to the transcript of EncPedPop the vector enc_secshare. This ensures that all participants agree on all encrypted shares, and as a consequence, the entire DKG output of a successful ChillDKG participant can be deterministically reproduced from a secret per-participant seed and the transcript.

This is true, but why is it useful? In order for this to recover any secret we'd need all the seeds -- but we are in the threshold setting where we're trying to be robust against loss of individual secrets so this doesn't seem super helpful?

This property is leveraged to offer a backup and recovery functionality: ChillDKG outputs a string called recovery data which is the concatenation of the transcript eq_input and the success certificate cert. The recovery data, which is the same for every participant, can be used by any participant together with the seed to recover the full output of the DKG session.

But the deterministic keygen property doesn't seem to be relevant here. In order to extract your secret share from the recorded transcript (i.e. recovery data) you just need to decrypt the share destined to you and verify the signature. You don't need to deterministically produce anything other than your static secret key.

Is the issue that you don't add your own contribution to your own secret share to the transcript? If so, I think it's much easier to just change that rather than forcing determinism on the whole key generation.
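To make concrete what I have in mind, here is a rough Python sketch of recovery. The helper names (parse_recovery_data, verify_cert, decrypt_share) and the eq_input fields are invented, not the reference code, and it assumes the encrypted "to self" share is part of the transcript as suggested above:

```python
# Rough sketch, not the reference code: helpers and field names are invented,
# and it assumes the encrypted "to self" share is included in the transcript.

def recover(recovery_data: bytes, my_index: int, my_static_seckey: bytes):
    eq_input, cert = parse_recovery_data(recovery_data)

    # The certificate proves that every participant signed (and therefore
    # agreed on) the same eq_input, including all encrypted shares.
    if not verify_cert(eq_input, cert):
        raise ValueError("invalid success certificate")

    # Decrypting the share destined to us only needs our static secret key;
    # nothing session-specific has to be re-derived deterministically.
    secshare = decrypt_share(my_static_seckey, eq_input.enc_secshares[my_index])

    return secshare, eq_input.threshold_pubkey
```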

jonasnick commented 3 months ago

You're right that deterministic key generation isn't a necessary consequence of the ability to recover from seed and recovery data.

ChillDKG is not deterministic. participant_step1 takes a fresh randomness argument that is used to derive the secret shared by the participant.

We should probably just remove "deterministically" from this sentence:

the entire DKG output of a successful ChillDKG participant can be deterministically reproduced from a secret per-participant seed and the transcript.

It's correct, but it gives the impression that the entire key generation is deterministic. The transcript is non-deterministic.

In order for this to recover any secret we'd need all the seeds

To recover a participant's secret, we just need their seed.

we are in the threshold setting where we're trying to be robust against loss of individual secrets so this doesn't seem super helpful?

If an individual secret is lost, you find yourself in a t-of-n-1 setup. In most cases you would want to achieve your original t-of-n setup again (to avoid a situation where losses over time accumulate to n < t). Without backups, this requires a new ChillDKG run which needs cooperation from all participants and on-chain transactions to move all coins from the old to the new wallet. This is particularly difficult in a federation of autonomous and mutually distrusting signers operated by different parties.

LLFourn commented 3 months ago

Ok thanks, I think I understand what you are going for now. I can restate my question in a more precise way as a response to:

To recover a participant's secret, we just need their seed.

What if all we need is their secret decryption key rather than this "seed"? The concept of a seed could be totally removed. If we just include your "to self" secret vss input encrypted to yourself in the aggregated ciphertexts then anyone with just the decryption key can recover the share. This means the spec can totally leave out of scope how you generate randomness for VSS, encryption nonces, the encryption keys themselves and so on while maintaining the same recoverability property.

I noticed this because I was diligently implementing all the tagged hashes and PRF inputs etc. as closely to the spec as I could, but then stopped myself because I couldn't figure out why I was doing it. I replaced seed: [u8;32] with rng: &mut impl rand_core::RngCore and everything became quite a bit simpler but seemed to be able to do everything the scheme could do before.
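In spec-style Python pseudocode, the contrast I mean is roughly the following (the tag and function names are made up, not the reference code):

```python
import hashlib
import secrets

def tagged_hash(tag: str, msg: bytes) -> bytes:
    # BIP340-style tagged hash, standing in here for the spec's built-in PRF.
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

# Spec-style: the per-session randomness is derived from a single seed plus
# context, so the derivation itself is pinned down by the spec.
def session_randomness_from_seed(seed: bytes, context: bytes) -> bytes:
    return tagged_hash("chilldkg/session-rand", seed + context)

# What I ended up with: pull the bytes from whatever RNG the implementation
# trusts; the spec no longer has to say where they come from, and recovery
# still only needs the long-term decryption ("host") key.
def session_randomness_from_rng() -> bytes:
    return secrets.token_bytes(32)
```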

jonasnick commented 3 months ago

What if all we need is their secret decryption key rather than this "seed"?

Is this a per-session decryption key or a long-term decryption key? What exactly does this improve and how? Would it allow the recovery data to remain public (mod privacy) and ensure agreement?

Something like this could be built from EncPedPop.

I replaced seed: [u8;32] with rng: &mut impl rand_core::RngCore

This means we need to establish an entirely new PKI from scratch for CertEq for every DKG instead of being able to reuse long term "host" public keys derived from the seeds.

jonasnick commented 3 months ago

I replaced seed: [u8;32] with rng: &mut impl rand_core::RngCore

And backing up a new secret for every session.

LLFourn commented 3 months ago

Is this a per-session decryption key or a long-term decryption key? What exactly does this improve and how? Would it allow the recovery data to remain public (mod privacy) and ensure agreement?

The long-term decryption key. It's the "host key". The goal is to simplify the specification and to stop it being opinionated about how to generate all the per-keygen random inputs. Everything aside from the "host key" can just be pulled from an RNG of the implementor's choosing.

Something like this could be built from EncPedPop.

Not as it's written, since it doesn't include the encrypted secret share contribution destined for yourself from yourself in the transcript.

This means we need to establish an entirely new PKI from scratch for CertEq for every DKG instead of being able to reuse long term "host" public keys derived from the seeds.

Why? I am trying to say that there should be no "seeds" as private input to the protocol at all. Just long-term host keys. Anyone with the host secret key can decrypt the share from the recovery data. The spec doesn't have to care where the application gets the host key from. Maybe it's generated and stored on an HSM in some hardware-specific way.

real-or-random commented 3 months ago

(Take this with a grain of salt, I'm out of office and only spent a moment thinking about this.)

The concept of a seed could be totally removed. If we just include your "to self" secret vss input encrypted to yourself in the aggregated ciphertexts then anyone with just the decryption key can recover the share. This means the spec can totally leave out of scope how you generate randomness for VSS, encryption nonces, the encryption keys themselves and so on while maintaining the same recoverability property.

Your observation is essentially correct. We have a built-in PRF and derive everything from a single random seed. One way to look at this is (a bit) like BIP340 signing. We give the implementer a proper way to derive all the randomness. But we can't force anyone to follow the spec when it comes to the randomness. If the implementer deviates from this and uses their own RNG, this is fine in the end. (BIP340 is still different because it can get away without any randomness...)

(Yes, then participants will need to encrypt their "to self" share, but I don't think this comes with any disadvantages.)

Though, the truth is that things are as they are because this is how the spec evolved. We made the protocol randomness entirely ephemeral only a while ago. I don't think it was a super deliberate decision to derive everything from a single seed, so it's a good point that you bring up.

I think one advantage of the status quo is that it's not the end of the world if you reuse the seed. We'll need to think about this again, but as far as I remember, the conclusion was that reusing the seed is not covered by the security proof in the paper, but it should be okay in practice. (You know, famous last words of the applied cryptographer.) This is partly due to how we derive things deterministically, and in particular what context information we throw into the hash/RNG as some form of defense in depth. If we let the user pass an RNG instead and this RNG is broken, it may for example happen that they'll reuse the encryption pads for different messages. This shouldn't happen in the current form of the protocol.
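As a toy illustration of what I mean by context as defense in depth (the tags and context strings here are invented, not what the spec actually does):

```python
import hashlib

def tagged_hash(tag: str, msg: bytes) -> bytes:
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

# Even if the same seed is (wrongly) reused across sessions, mixing the
# purpose tag and per-session/per-recipient context into the hash keeps the
# derived values distinct, so two different messages never end up encrypted
# under the same pad.
seed = bytes(32)  # deliberately "reused" seed, for illustration only

pad_s1 = tagged_hash("enc-pad", seed + b"session 1" + b"recipient 2")
pad_s2 = tagged_hash("enc-pad", seed + b"session 2" + b"recipient 2")
vss_s1 = tagged_hash("vss-rand", seed + b"session 1")

assert len({pad_s1, pad_s2, vss_s1}) == 3

# With a caller-supplied RNG instead, a broken RNG that returns the same
# bytes twice could silently cause exactly this kind of reuse.
```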

Another advantage of deriving everything from a single seed is that it will simplify test vectors. And these will already be complex enough for an interactive protocol.

A last thing we had considered is that with the current protocol, you could in principle recover the entire VSS. This may or may not be helpful in protocol extensions and/or debugging. (But yeah, this is really a very minor thing.)

Perhaps a middle way is simply to state that the randomness can be generated differently. This will be again like BIP340. We'll just need to indicate precisely what inputs constitute "randomness" because there are many places where we use randomness. (I'm sure there's some way to indicate precisely what all is "randomness", e.g., some Python type.)
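For instance, something like this (just a sketch of the idea; the field names are invented and don't match the reference code):

```python
import secrets
from typing import NamedTuple

# A single type that pins down exactly which inputs the spec treats as
# "randomness". Field names are invented, not the reference code's.
class Randomness(NamedTuple):
    vss_rand: bytes   # randomness behind the secret VSS polynomial
    enc_rand: bytes   # randomness used when encrypting shares
    misc_rand: bytes  # any remaining per-session randomness

# The spec's default would populate this deterministically from the seed via
# the built-in PRF; an implementation that prefers its own RNG could instead
# fill the same fields like this:
def fresh_randomness() -> Randomness:
    return Randomness(*(secrets.token_bytes(32) for _ in range(3)))
```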

The long-term decryption key. It's the "host key". The goal is to simplify the specification and to stop it being opinionated about how to generate all the per-keygen random inputs. Everything aside from the "host key" can just be pulled from an RNG of the implementor's choosing.

This is slightly related to https://github.com/BlockstreamResearch/bip-frost-dkg/issues/28 by the way.


From the text this seems to be the current motivation:

ChillDKG constructs a transcript eq_input by appending to the transcript of EncPedPop the vector enc_secshare. This ensures that all participants agree on all encrypted shares, and as a consequence, the entire DKG output of a successful ChillDKG participant can be deterministically reproduced from a secret per-participant seed and the transcript.

This sentence doesn't say anything about the protocol using an internal PRF for deterministic derivation. It just states that the DKG output can be reproduced deterministically. (Decryption is a deterministic operation!) We should perhaps omit the word "deterministically" if it's confusing in this context.

jesseposner commented 3 months ago

If an individual secret is lost, you find yourself in a t-of-n-1 setup. In most cases you would want to achieve your original t-of-n setup again (to avoid a situation where losses over time accumulate to n < t). Without backups, this requires a new ChillDKG run which needs cooperation from all participants and on-chain transactions to move all coins from the old to the new wallet. This is particularly difficult in a federation of autonomous and mutually distrusting signers operated by different parties.

A last thing we had considered is that with the current protocol, you could in principle recover the entire VSS.

I don't think it changes the analysis here, but just wanted to mention there are some alternative methods available to handle these scenarios.

The share repair protocol can be run such that a threshold of participants can help another participant recover their lost share. In addition, in the event of a lost share, it's often a good idea to run the refresh protocol to defend against an attacker who recovers the lost share (e.g. if the lost share is on a device that was physically lost). These protocols are off-chain and do not require a sweep. So I think in practice, we will likely see repair+refresh used to handle lost shares in federations.

For recovering the VSS, a threshold of public verification shares can be used to derive the VSS, and a threshold of secret shares can be used to derive a threshold of public verification shares.
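Concretely, this is just the standard interpolation-in-the-exponent argument (notation mine, not the spec's):

$$g^{f(x)} = \prod_{i \in S} X_i^{\lambda_{S,i}(x)}, \qquad \lambda_{S,i}(x) = \prod_{j \in S,\, j \neq i} \frac{x - j}{i - j} \bmod q,$$

where $f$ is the degree-$(t-1)$ VSS polynomial, $X_i = g^{f(i)}$ is participant $i$'s public verification share, $S$ is any set of $t$ participant indices, and $q$ is the group order. A holder of a secret share $f(i)$ can publish $X_i = g^{f(i)}$ directly, so $t$ secret shares give $t$ verification shares, which in turn determine $g^{f(x)}$ for every $x$ and hence the whole VSS commitment.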

LLFourn commented 3 months ago

With the context provided by @real-or-random, my campaign can be reduced to including the "to self" encrypted share in the agg ciphertext and stressing the use of the host secret key, rather than a "seed", to recover the output (they might be the same thing, #28). Using a PRF to ease writing the spec tests is a fine approach, I think. In practice, though, I'd encourage the PRF to be seeded with true fresh randomness in addition (same as BIP340).

A last thing we had considered is that with the current protocol, you could in principle recover the entire VSS. This may or may not be helpful in protocol extensions and/or debugging. (But yeah, this is really a very minor thing.)

I think all the spec needs to do here is have the source of randomness passed in. The PRF works well for this: repeating the same one with exactly the same inputs will get you the same outputs. Keep in mind, as @jesseposner mentions, you can in principle recover the entire VSS output state for each party with only t shares, which does seem strictly better than needing n things.

From the text this seems to be the current motivation:

ChillDKG constructs a transcript eq_input by appending to the transcript of EncPedPop the vector enc_secshare. This ensures that all participants agree on all encrypted shares, and as a consequence, the entire DKG output of a successful ChillDKG participant can be deterministically reproduced from a secret per-participant seed and the transcript.

This sentence doesn't say anything about the protocol using an internal PRF for deterministic derivation. It just states that the DKG output can be reproduced deterministically. (Decryption is a deterministic operation!) We should perhaps omit the word "deterministically" if it's confusing in this context.

Because of the missing to-self encryption you do have to use the internal PRF for deterministic derivation in precisely the way specified. I don't think "deterministically" is wrongly used or confusing; it's just that recovery shouldn't require deterministic derivation.

jonasnick commented 3 months ago

Something like this could be built from EncPedPop.

Not as it's written, since it doesn't include the encrypted secret share contribution destined for yourself from yourself in the transcript.

Right. I opened PR #43 to do that. Is this what you're suggesting?

PR #43 makes the code a bit simpler, increases communication slightly and does not increase the size of the eq_input (aka "transcript") because the sum of all encshares had already been included. I think that this is a reasonable argument that encrypting the "self share" to yourself is worth doing.

I'm not quite convinced by @LLFourn's argument for doing this though. Deriving secrets via a PRF instead of obtaining the secrets via an RNG is a common defense-in-depth measure that protects against broken RNGs. Moreover, using an RNG can make exhaustive test vectors more difficult (although this depends on what exactly the test vectors look like; maybe it can be avoided).

LLFourn commented 3 months ago

Something like this could be built from EncPedPop.

Not as it's written, since it doesn't include the encrypted secret share contribution destined for yourself from yourself in the transcript.

Right. I opened PR #43 to do that. Is this what you're suggesting?

Yes! Thanks.

I'm not quite convinced by @LLFourn's argument for doing this though. Deriving secrets via a PRF instead of obtaining the secrets via an RNG is a common defense-in-depth measure that protects against broken RNGs. Moreover, using an RNG can make exhaustive test vectors more difficult (although this depends on what exactly the test vectors look like; maybe it can be avoided).

To be clear, I agree that the PRF API is the best way to go about this spec. I just wanted to point out that with the changes in #43 an implementation can use true randomness in addition to secrets and public inputs to sample the PRF and this is a good thing since it adds to defense-in-depth.
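Roughly in the spirit of BIP340's synthetic nonces, something like this (a sketch with invented tag names, not a proposal for the exact derivation):

```python
import hashlib
import secrets
from typing import Optional

def tagged_hash(tag: str, msg: bytes) -> bytes:
    th = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(th + th + msg).digest()

# Fresh randomness is XORed into the secret input of the PRF, so the derived
# values remain secure as long as either the RNG or the static secret is good.
def prf_key(static_secret: bytes, aux_rand: Optional[bytes] = None) -> bytes:
    if aux_rand is None:
        aux_rand = secrets.token_bytes(32)
    mask = tagged_hash("chilldkg/aux", aux_rand)
    return bytes(a ^ b for a, b in zip(static_secret, mask))

def derive(key: bytes, public_context: bytes) -> bytes:
    return tagged_hash("chilldkg/derive", key + public_context)
```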

real-or-random commented 1 month ago

It's a bit off-topic, but let me point out that I read @FiloSottile's newsletter [1] last week, in which he argues for deterministic derivation for similar reasons to those brought up in this issue, e.g., testability.

And there's this paragraph:

It’s tempting to say “just seed a stream cipher and use that as the CSPRNG” but that adopts the internals of the stream cipher, of how it’s instantiated, and how it’s used as a CSPRNG into your protocol, which might be a much worse outcome than a deliberate design. For example, the ZCash Powers of Tau MPC ceremony needed to generate a number of elliptic curve points from a seed, so they just called ChaChaRng::from_seed, got a Rust Rand trait implementation, and passed it down to Fq::Rand. Then they asked me to reimplement the protocol in Go. The result was… fun.

This really reminded me of @LLFourn's comment above:

I noticed this because I was diligently implementing all the tagged hashes and PRF inputs etc. as closely to the spec as I could, but then stopped myself because I couldn't figure out why I was doing it. I replaced seed: [u8;32] with rng: &mut impl rand_core::RngCore and everything became quite a bit simpler but seemed to be able to do everything the scheme could do before.

[1] I hope it's appropriate to @mention you and generate a notification here... :)

LLFourn commented 3 weeks ago

Nice article. The first paragraph is a good response to the second. Having multiple implementations produce the exact same transcript given the same inputs is a noble goal. I think to do this well we might want to model the context a bit more formally in the code so it's easy to implement and debug.

For inspiration see: https://docs.rs/merlin/latest/merlin/struct.Transcript.html which formally models the transcripts for sigma protocols. I think having a python Transcript class in the code where you add things and can pull out random things and the DH pad could work. This is nice because in the tests you can print out the full transcripts thus far and compare with your own implementation and see where they differ.
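Something along these lines, as a very rough sketch (method names are borrowed loosely from merlin, the hashing is deliberately simplistic, and none of this is meant as a concrete proposal):

```python
import hashlib

class Transcript:
    """Minimal merlin-style transcript: append labeled messages, then squeeze
    out challenges/randomness bound to everything appended so far."""

    def __init__(self, label: bytes):
        self._state = hashlib.sha256(b"transcript/" + label).digest()

    def append(self, label: bytes, message: bytes) -> None:
        # Length-prefix label and message so the encoding is unambiguous.
        data = (len(label).to_bytes(8, "big") + label +
                len(message).to_bytes(8, "big") + message)
        self._state = hashlib.sha256(self._state + data).digest()

    def challenge(self, label: bytes, length: int = 32) -> bytes:
        out = b""
        counter = 0
        while len(out) < length:
            out += hashlib.sha256(self._state + label +
                                  counter.to_bytes(4, "big")).digest()
            counter += 1
        # Ratchet the state so later output depends on what was squeezed out.
        self._state = hashlib.sha256(self._state + b"ratchet" + label).digest()
        return out[:length]
```

In the tests you could then print the internal state (or a log of appends) after every step and diff it against another implementation to see exactly where they diverge.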

real-or-random commented 2 weeks ago

For inspiration see: docs.rs/merlin/latest/merlin/struct.Transcript.html which formally models the transcripts for sigma protocols. I think having a python Transcript class in the code where you add things and can pull out random things and the DH pad could work. This is nice because in the tests you can print out the full transcripts thus far and compare with your own implementation and see where they differ.

Hm, interesting idea. I was aware of the abstraction in merlin, but I didn't consider using something like this here. So far, we have been careful not to rely too much on advanced Python features (well, if you can call a class advanced), to keep the code in "executable pseudocode" form that remains accessible to people not familiar with Python. But it's certainly worth considering, and OOP is common enough that this probably won't create problems for readers, as long as we don't rely too much on Python specifics.

This is somewhat related to https://github.com/BlockstreamResearch/bip-frost-dkg/issues/50.