BlockstreamResearch / bip-frost-dkg


Decoupling parties providing shares from parties receiving shares #2

Open LLFourn opened 8 months ago

LLFourn commented 8 months ago

It would be useful if the set of parties providing input polynomials could be different from those providing output shares. Examples:

  1. Overcontributing randomness: n parties are receiving shares but n+1 parties provide input because another party (call it the coordinator) wants to contribute to the key generation in case more than t of the share-receiving parties are malicious. Even if all shares end up in malicious hands, they still may be unable to do anything malicious if they physically cannot communicate with each other or the outside world without going through the coordinator.
  2. Undercontributing randomness: n parties are receiving shares but at least one party among any t parties is assumed to be honest, so we can just ask an arbitrary t parties to generate the key and distribute shares. I claim that only one party being honest is enough for a key generation to be secure. This is more efficient but there might also be other reasons like some of the parties are not online to receive the shares at the moment.

Intuitively it makes sense why only 1 honest party is sufficient: the actual key you use still has a contribution from an honest party, and the malicious parties have fewer than t shares for that honest contribution. I think if you look at the structure of the proof in Practical Schnorr Threshold Signatures Without the Algebraic Group Model it also implies this. The simulation chooses one honest party to be the keygen simulator. In fact I'd guess the proof would be simpler if you always assume only one honest party in the keygen because there would be less "window dressing" around simulating the other honest parties. It's important to note that this paper couples the security proof of signing and keygen, so changing this requires carefully looking at the implications elsewhere.
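To make the intuition concrete, here is a toy sketch (insecure, illustrative only; the field, threshold and party counts are made up) of a sum-of-polynomials keygen over a prime field. The joint secret is the sum of every dealer's constant term, so as long as one dealer's constant term is uniformly random, the joint secret is uniformly random no matter what the other dealers contribute:

```python
# Toy sum-of-polynomials DKG over a prime field. NOT secure code; it only
# demonstrates that the joint secret is the sum of all dealers' constant
# terms, so one honest (uniformly random) dealer suffices for randomness.
import secrets

P = 2**61 - 1          # toy field modulus (a Mersenne prime)
T, N = 3, 5            # threshold t and number of share receivers n

def random_poly(t):
    """Degree t-1 polynomial with uniformly random coefficients."""
    return [secrets.randbelow(P) for _ in range(t)]

def evaluate(poly, x):
    """Horner evaluation of poly at x; poly[0] is the constant term."""
    acc = 0
    for coeff in reversed(poly):
        acc = (acc * x + coeff) % P
    return acc

def lagrange_at_zero(points):
    """Interpolate f(0) from t points (x_i, y_i)."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

# Every dealer contributes one polynomial; receiver i's share is the sum
# of all dealers' evaluations at i.
dealer_polys = [random_poly(T) for _ in range(N)]
shares = {i: sum(evaluate(f, i) for f in dealer_polys) % P
          for i in range(1, N + 1)}

joint_secret = sum(f[0] for f in dealer_polys) % P
recovered = lagrange_at_zero([(i, shares[i]) for i in (1, 3, 5)])
assert recovered == joint_secret
```

Any t of the summed shares interpolate to the same joint secret, which is exactly why fewer than t shares reveal nothing about the honest contribution.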

The most general goal is to make the two sets (input providers and share receivers) completely distinct, where it is secure as long as at least 1 input provider is honest and at most t-1 share receivers are malicious. Not sure whether the proof in the "practical" paper will lend itself to such generality though.

jonasnick commented 8 months ago

Undercontributing randomness: [...] This is more efficient but there might also be other reasons like some of the parties are not online to receive the shares at the moment.

Interesting idea. While I can at least intuitively follow why it would work, I don't fully understand the use case. The parties must be online to receive the shares anyway to decrypt their shares and run the Eq protocol. Do you mean that they may not be online at share generation time? That would make sense to me but it's a tiny improvement that may not justify additional complexity (in particular since currently our design approach does not optimize for communication complexity).

However, as you mention, this may not require much additional complexity if we implement the "overcontributing randomness" idea via making input providers and share receivers completely distinct.

LLFourn commented 8 months ago

Undercontributing randomness: [...] This is more efficient but there might also be other reasons like some of the parties are not online to receive the shares at the moment.

Interesting idea. While I can at least intuitively follow why it would work, I don't fully understand the use case. The parties must be online to receive the shares anyway to decrypt their shares and run the Eq protocol. Do you mean that they may not be online at share generation time? That would make sense to me but it's a tiny improvement that may not justify additional complexity (in particular since currently our design approach does not optimize for communication complexity).

They may be offline completely during the whole process. The goal is to have a weak kind of robustness. They may receive their shares later by requesting them from a few parties. In many real-world deployments, communication happens through some redundant central event-log system from which you can retrieve messages on a certain topic (nostr is kind of an example). For example, you want to generate 4-of-9 but you still want the keygen to work even if only 7 nodes are online right now. The parties communicate through a central coordinator where you can send authenticated messages to all the other parties, which will stay in their inbox until they are retrieved. Once the coordinator sees the fourth keygen polynomial input it will declare the keygen finished, and nodes may collect their encrypted shares and build their FROST key share. When at least 7 nodes (which must include at least 4 honest) declare they have retrieved their key share for the new key, it becomes ready to use. Of course the two offline nodes may have invalid shares, but this may be tolerable since we know 4 honest parties have valid shares. I'm handwaving some things here, but I do believe this may find practical application in doing FROST custody for services like exchanges etc.
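The coordinator logic described above can be sketched roughly as follows (all class and method names are hypothetical, and the commitments are treated as opaque strings; this is only a shape for the event-log flow, not a proposal):

```python
# Toy sketch of the event-log coordinator for the 4-of-9 example: the input
# phase closes after t polynomial contributions, and the key is declared
# usable once an application-chosen quorum of receivers has acknowledged
# retrieving a valid share. Names are hypothetical.
T, N, ONLINE_QUORUM = 4, 9, 7

class KeygenLog:
    def __init__(self):
        self.inputs = {}          # dealer id -> polynomial commitment (opaque)
        self.acks = set()         # receiver ids that confirmed a valid share
        self.inputs_closed = False

    def post_input(self, dealer_id, commitment):
        if not self.inputs_closed:
            self.inputs[dealer_id] = commitment
            if len(self.inputs) == T:   # the fourth input closes the phase
                self.inputs_closed = True

    def post_ack(self, receiver_id):
        if self.inputs_closed:
            self.acks.add(receiver_id)

    def key_ready(self):
        # Policy: 7 of 9 receivers (at least 4 of them honest by assumption)
        # must have confirmed their shares before the key is used.
        return self.inputs_closed and len(self.acks) >= ONLINE_QUORUM

log = KeygenLog()
for dealer in range(5):                 # the fifth input arrives too late
    log.post_input(dealer, f"commitment-{dealer}")
for receiver in range(ONLINE_QUORUM):   # receivers 7 and 8 stay offline
    log.post_ack(receiver)
assert len(log.inputs) == T and log.key_ready()
```

The two offline receivers can fetch their encrypted shares from the log later, which is the "weak robustness" being aimed for.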

I think the example above also highlights something interesting. It may not necessarily be the parties receiving shares that are doing the Eq check. It could be a more user-facing part of the system that is checking in the event log that at least 7 nodes have reported having shares under the polynomial derived from the four inputs (earlier events) and therefore the key is ready to use and addresses are ready to display to the user. The nodes receiving shares may never check this themselves.

jonasnick commented 8 months ago

For example, you want to generate 4-of-9 but you still want the keygen to work even if only 7 nodes are online right now.

I'm not really convinced of the usefulness of this feature. If I'm fine with a 4-of-7, I could just do a 4-of-7. So if I understand correctly, this feature only makes sense if I'm fine with a 4-of-7 but I prefer a 4-of-9. However, before sending coins to an address derived by the shared public key, wouldn't I want to know whether n is 7 or 9? If so, I need all 9 nodes to be online before using the address.

Could you explain how this would be useful for custody/exchanges?

It may not necessarily be the parties receiving shares that are doing the Eq check

It should be possible to use SecPedPop and the certifying network-based Eq protocol to support this - at least how they are currently specified. The "hostkeys" used in the Eq protocol could be set to some parties doing the Eq check that are not necessarily the signers.

LLFourn commented 8 months ago

I'm not really convinced of the usefulness of this feature. If I'm fine with a 4-of-7, I could just do a 4-of-7. So if I understand correctly, this feature only makes sense if I'm fine with a 4-of-7 but I prefer a 4-of-9. However, before sending coins to an address derived by the shared public key, wouldn't I want to know whether n is 7 or 9? If so, I need all 9 nodes to be online before using the address.

Could you explain how this would be useful for custody/exchanges?

I don't have a deeper insight other than, after consulting my feelings, they say 4-of-7 might be good enough in the short term when 4-of-9 is desirable long term, if it means you can tolerate some temporarily offline parties during key generation. The risk of loss is a risk taken over time, after all. Hopefully we can get some people who want to use FROST in a custodial setting to assess this better. I agree that by itself this use case is pretty niche and doesn't warrant a lot of work in the spec to accommodate it if it doesn't happen naturally.

Worth noting the above scenario might be addressed better with an enrollment protocol i.e. you design the system to do the 4-of-7 with whoever is online right now and then "repair" the situation to end up with a 4-of-9 whenever the nodes who missed out come back online.

My main input is that if we are sure that only 1 honest party is needed to contribute input polynomials then we should probably not arbitrarily specify more than that as there may be unforeseen use cases that could make use of this relaxation. A decent example is where you just want to do a trusted dealing -- this decoupling would allow that scenario to fit perfectly into the spec and any software that implements it.

It might actually be simple to specify also because it allows you to split up the "roles" in the keygen and deal with their requirements individually:

  1. Keygen input providers (1 must be honest)
  2. Keygen share receivers (< t must be dishonest)
  3. Keygen validators -- check that enough "keygen share receivers" have received correct shares against the same keygen hash to satisfy application policy.
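Role (3) in particular is simple enough to sketch. The following toy validator (hypothetical names; the "reports" would in practice be signed certifications pulled from the event log) never holds a share itself, it only checks that a quorum of receivers reported the same keygen transcript hash:

```python
# Toy sketch of the "keygen validator" role: accept the key only if at
# least `quorum` receivers certified the *same* keygen hash. Hypothetical
# interface; real reports would be authenticated (e.g. signatures).
from collections import Counter

def keygen_validated(reports, quorum):
    """reports: list of (receiver_id, keygen_hash) pairs from the event log.

    Returns the agreed-upon keygen hash, or None if no hash reaches quorum.
    """
    by_hash = Counter(h for _, h in reports)
    best_hash, count = by_hash.most_common(1)[0]
    return best_hash if count >= quorum else None

# 7 receivers agree on one transcript hash, 1 reports a different one.
reports = [(i, "deadbeef") for i in range(7)] + [(7, "aaaaaaaa")]
assert keygen_validated(reports, quorum=7) == "deadbeef"
assert keygen_validated(reports, quorum=8) is None
```

This matches the earlier observation that the Eq check can live in a user-facing part of the system rather than in the share receivers themselves.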

It should be possible to use SecPedPop and the certifying network-based Eq protocol to support this - at least how they are currently specified. The "hostkeys" used in the Eq protocol could be set to some parties doing the Eq check that are not necessarily the signers.

Yes. I think the main question is do we want to specify/test particular ways of doing (3) like having multiple parties certify with signatures or reduce scope to just explaining the considerations involved when designing the "keygen validator" role in your application.

jonasnick commented 7 months ago

Some notes after discussing with @real-or-random:

Some keygen share receivers are offline (i.e. the 4-of-7 -> 4-of-9 example)

One practical question in that scenario is to whom the shares are encrypted if signers 8 and 9 have never been online to share their encryption key.

That issue aside, in RecPedPop all participating signers receive the encrypted shares of all other signers (including those of signers 8 and 9). Maybe it would be possible to use a publicly verifiable secret sharing scheme such as Schoenmakers 99 to allow signers 1 to 7 to verify the shares of signers 8 and 9.

We agree that if this doesn't work or specifying it in the BIP is complicated or has other drawbacks, it makes sense to consider using the enrolment protocol to onboard the new signers.

Undercontributing randomness

I think the fact that a trusted dealer mode would be covered by this is quite interesting but also does not really fit the scope of this DKG BIP. We should check how the complexity of the BIP is increased if we support this. In particular, some additional optimizations may be possible in a trusted dealer mode that would justify having entirely separate "DKG" algorithms for that mode.

Overcontributing randomness

@real-or-random and I realized that there is still a potential covert channel: signing devices can just refuse to display certain addresses and provide a reasonably looking error to the user. Users would then regenerate addresses until the signing devices are happy. This allows the signing devices to bias the scriptPubkeys and exfiltrate secret key material. The implication is that the software wallet ("host") must insist on using all requested derivation paths and somehow communicate to the user how to proceed in case the signing devices don't play along (even if it looks like a bug). This is somewhat similar to the anti-exfil setting when the signing device refuses to sign with certain host provided randomness (except that address derivation is deterministic whereas signatures are randomized from the point of view of the host).

Hence, supporting overcontributing randomness in the BIP does not necessarily reduce the risk of secret key exfiltration. Rather, it ends up being a UI issue that may or may not be sufficiently solvable. It is possible to argue that if users actually want to reduce the risk, they should employ signing devices from different manufacturers. But if overcontributing randomness is straightforward to specify in the BIP it may still be worth doing.
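As a small illustration of how cheap this covert channel is, here is a toy simulation (the "address derivation" is just a hash stand-in, and the parameters are made up): a malicious signer keeps reporting an "error" until the next fresh address encodes the next secret bit it wants to leak, after which the attacker reads the bits off the published addresses:

```python
# Toy simulation of the address-rejection covert channel: the device rejects
# any derived address whose low bit differs from the next secret bit it wants
# to exfiltrate. The attacker recovers the bits from the published addresses.
# "derive_address" is a hash stand-in for deterministic address derivation.
import hashlib
import secrets

secret_bits = [secrets.randbelow(2) for _ in range(16)]   # bits to leak

def derive_address(index):
    return hashlib.sha256(f"addr/{index}".encode()).digest()

published, errors, idx = [], 0, 0
for bit in secret_bits:
    # Device keeps "erroring" until the derived address encodes the bit;
    # the user just regenerates the next address each time.
    while derive_address(idx)[-1] & 1 != bit:
        errors += 1
        idx += 1
    published.append(idx)
    idx += 1

leaked = [derive_address(i)[-1] & 1 for i in published]
assert leaked == secret_bits
# On average the user sees about one spurious "error" per leaked bit.
```

This is why the host insisting on every requested derivation path matters: if the host accepts "skip this address" at face value, the rejections themselves become the channel.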

tl;dr: We should check if PVSS is possible and how much complication under- and overcontributing randomness would add to the current version of the BIP.

nickfarrow commented 7 months ago

Re: potential covert channel when overcontributing randomness,

I see two separate attacks here:

1) The secret is predetermined by malicious devices

By including randomness from the coordinator we aim to remove the possibility of 1). Even if all signers were compromised with the intention to produce a predetermined secret via the DKG, the honest coordinator/host can verifiably include entropy that enforces creation of a random secret.

2) The secret is leaked by malicious devices

After a brief discussion with Lloyd, we think 2) is unrealistically difficult. FROST prevents the low hanging fruit of this attack with nonce commitments, but the more general form of this attack may be impossible to prevent, for any scheme.

The argument is that signers have the option to either sign or not sign each particular message that gets broadcast. A malicious signer can then always selectively choose which messages they cooperate in signing, such that the intended secret is leaked through the messages that are broadcast publicly.

Pieter Wuille mentions this here: 2.e) Failure bias

If HW is allowed to fail 255 times out of 256, it can introduce an 8-bit bias

I read your covert channel as leaking the secret across multiple addresses, which seems more realistic than trying to transmit a big chunk of some secret at once and failing constantly.

If you limit signing failure to 50-50 (Lloyd thought via the right bit of the transaction locktime) you'd expect a roughly equal number of signing failures for every bit you transfer. The user would experience an incredible number of signing failures before any meaningful portion of the secret is leaked.
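The arithmetic behind both the 255/256 and the 50-50 cases is the same (a back-of-the-envelope sketch; the 256-bit share size is just for illustration): a device allowed to fail k times per successful signature can grind among k+1 candidates, i.e. leak log2(k+1) bits per signature that goes through.

```python
# Back-of-the-envelope numbers for the failure-bias channel.
import math

def bits_per_signature(allowed_failures):
    """Bits exfiltrated per successful signature if the device may fail
    `allowed_failures` times per success (grinding among k+1 candidates)."""
    return math.log2(allowed_failures + 1)

# Pieter Wuille's example: 255 failures out of 256 attempts -> 8-bit bias.
assert bits_per_signature(255) == 8.0

# The 50-50 case from this thread: roughly one failure per success leaks
# one bit, so exfiltrating a 256-bit share costs ~256 successful signatures
# and about as many user-visible "errors".
share_bits = 256
signatures_needed = share_bits / bits_per_signature(1)
assert signatures_needed == 256.0
```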

With FROST, if the hardware devices are unable to communicate to one another then you'd also need a threshold number of corrupted devices to each leak their secret share via competing signing success/failures.

jonasnick commented 7 months ago

2) is unrealistically difficult

Do you mean unrealistically difficult for an attacker to pull off? How so? Because the user would notice a 50% failure in signing? In my experience, normal users are able to live with small, annoying "bugs" for a very long time because it's faster to just retry the operation than to investigate the issue. The attacker also could halve the failure probability and only communicate one bit in two signatures, etc.

Therefore, in the case of anti-exfil, the software wallet would actually have to notice that over the lifetime of the HWW there have been too many signing failures and therefore mark the device as broken/malicious. The "address covert channel" is yet another way to exfiltrate (extra) secret bits that appears to be harder to prevent by the software wallet.

jonasnick commented 7 months ago

With FROST, if the hardware devices are unable to communicate to one another then you'd also need a threshold number of corrupted devices to each leak their secret share via competing signing success/failures

We assume that signers are uniquely identified by their index, so they can agree without communication that the signer at index 0 starts exfiltrating first.

nickfarrow commented 6 months ago

I meant unrealistic for the user tolerating a failure rate that is sufficiently useful for an attacker. But good point that the device could use an even lower failure rate than 50-50 which the user might tolerate.

There is the added difficulty for the attacker of identifying the affected script pubkeys (/transactions), and in the correct order. But we cannot guarantee this is difficult; it is trivial if the attacker also has an xpub.

It would be good to quantify this attack; you could have the coordinator/wallet warn users if there has been a suspicious number of failures out of the total number of signatures. But attackers could work around any detection attempts.

Discrepancies between devices should bring extreme caution, e.g. if there are inexplicable failures in address verification or transaction signing errors which don't make sense. But communicating when this is appropriate could be difficult, since there will be authentic device errors at some point.

Very interesting. It affects all hardware devices, FROST or not. I think it would still be worthwhile for us to add extra randomness into the DKG to prevent the predetermined-secret attack. It is much simpler.

LLFourn commented 6 months ago

Some notes after discussing with @real-or-random:

Some keygen share receivers are offline (i.e. the 4-of-7 -> 4-of-9 example)

One practical question in that scenario is to whom the shares are encrypted if signers 8 and 9 have never been online to share their encryption key.

That issue aside, in RecPedPop all participating signers receive the encrypted shares of all other signers (including those of signers 8 and 9). Maybe it would be possible to use a publicly verifiable secret sharing scheme such as Schoenmakers 99 to allow signers 1 to 7 to verify the shares of signers 8 and 9.

Hmm, I think Schoenmakers' scheme is not PVSS for scalar secret shares but rather it publicly shares secret group elements. There are two mappings, one using G and one using H as the basepoint. You construct the polynomial commitment with base H and provide a DLEQ proof to show that the encrypted group elements with base G are also correct (without opening them). Not sure how to apply this to scalar secret shares -- indeed, if it were applicable, I feel it would mean scalar verifiable encryption was easy (but it isn't!).

Undercontributing randomness

I think the fact that a trusted dealer mode would be covered by this is quite interesting but also does not really fit the scope of this DKG BIP. We should check how the complexity of the BIP is increased if we support this. In particular, some additional optimizations may be possible in a trusted dealer mode that would justify having entirely separate "DKG" algorithms for that mode.

Sure. I am making the proposition on the grounds that splitting the document into roles would make specification simpler and more flexible. I'll have a go at restructuring things in this way to show what I mean.

Overcontributing randomness

@real-or-random and I realized that there is still a potential covert channel: signing devices can just refuse to display certain addresses and provide a reasonably looking error to the user. Users would then regenerate addresses until the signing devices are happy. This allows the signing devices to bias the scriptPubkeys and exfiltrate secret key material. The implication is that the software wallet ("host") must insist on using all requested derivation paths and somehow communicate to the user how to proceed in case the signing devices don't play along (even if it looks like a bug). This is somewhat similar to the anti-exfil setting when the signing device refuses to sign with certain host provided randomness (except that address derivation is deterministic whereas signatures are randomized from the point of view of the host).

Hence, supporting overcontributing randomness in the BIP does not necessarily reduce the risk of secret key exfiltration. Rather, it ends up being a UI issue that may or may not be sufficiently solvable.

Well it probably does reduce the risk of key exfiltration. Attacks that assume things about user behavior are always going to be harder to pull off than those that don't. The amount of risk reduction is up for debate and is not something that can be known deductively. Users are incredibly tricky beasts (luckily I never have any!).

I think it's fine to say to implementors of this BIP: "Even if you do all this right and use the flexibility offered to mix in extra polynomials from other devices, you will still have to defend against attacks on user behavior". The simplest attack I can think of that relies on user behavior is to have the malicious firmware pop up on the screen: "ERROR occurred please report by scanning code <QR CODE that includes master secret in base64 in the url query string>". I'd predict this would net more coins than user rejection sampling addresses or signatures to leak single bits at a time. User education and UI design have to contend with these problems, and we need to explain them better in general, and specifically any new ones introduced by this BIP, but the existence of these attacks does not devalue tools to disrupt malicious key generators. On the contrary, it's the other way around: there is no point in doing user education and careful UI design if a group of malicious devices can contrive a malicious key without the user's help.

On the specifics of the BIP, I think we don't strictly need to include this bit, because at the implementation level you can just say "encrypt a garbage share to this ID", and this way one party contributes a polynomial but doesn't get anything in return and doesn't complain. I think there is value in including and encouraging adding extra devices to input polynomials, and in being explicit about how and why you would do that.

With FROST, if the hardware devices are unable to communicate to one another then you'd also need a threshold number of corrupted devices to each leak their secret share via competing signing success/failures.

Except that in our design they are daisy chained together so they can!

jonasnick commented 6 months ago

I think Schoenmakers' scheme is not PVSS for scalar secret shares but rather it publicly shares secret group elements.

Hmm, right. Now after skimming their paper it seems like the techniques are not applicable. For EncPedPop as currently specified it should be possible to use a generic ZK scheme for the relation

{ G ; enc_share, receiver encryption key enckey_r, receiver index i, dealer encryption key enckey_d, VSS_commitment; dealer decryption key deckey, polynomial f:
  commit(f) = VSS_commitment,
  enckey_d = deckey*G,
  enc_share = f(i) + ECDH(deckey, enckey_r) }

It's not so easy due to the hashing in ECDH, but maybe worth an extension one day.
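For intuition, the statement itself can be checked in the clear with a toy group (no zero knowledge here, just a prover-side sanity check that an honest dealer satisfies the relation; the multiplicative group mod a prime and the hashed "ECDH" are stand-ins for the real elliptic-curve versions):

```python
# Toy, in-the-clear check of the EncPedPop relation above. NOT a ZK proof:
# it only shows the three clauses hold for an honest dealer. The group
# Z_P^* and hashed modular "ECDH" stand in for the elliptic-curve originals.
import hashlib
import math
import secrets

P = 2**127 - 1    # Mersenne prime: toy multiplicative group Z_P^*
G = 7             # toy generator (illustrative)
Q = P - 1         # exponents reduced mod the group order

def ecdh(deckey, enckey):
    # Hashed Diffie-Hellman, mirroring the hashing that makes the real
    # relation hard to prove with algebraic techniques.
    shared = pow(enckey, deckey, P)
    return int.from_bytes(
        hashlib.sha256(str(shared).encode()).digest(), "big") % Q

f = [secrets.randbelow(Q) for _ in range(3)]      # dealer polynomial, t = 3
deckey_d = secrets.randbelow(Q); enckey_d = pow(G, deckey_d, P)
deckey_r = secrets.randbelow(Q); enckey_r = pow(G, deckey_r, P)
i = 2                                             # receiver index

f_i = sum(c * i**k for k, c in enumerate(f)) % Q
enc_share = (f_i + ecdh(deckey_d, enckey_r)) % Q
vss_commitment = [pow(G, c, P) for c in f]        # commit(f)

# The clauses of the relation, checked in the clear:
assert enckey_d == pow(G, deckey_d, P)
share = (enc_share - ecdh(deckey_r, enckey_d)) % Q    # ECDH is symmetric
assert share == f_i
assert pow(G, share, P) == math.prod(
    pow(C, i**k, P) for k, C in enumerate(vss_commitment)) % P
```

The last assertion is the usual VSS consistency check (G^f(i) against the commitment evaluated "in the exponent"); the hashing inside `ecdh` is exactly what a generic ZK scheme would have to handle.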

"ERROR occurred please report by scanning code <QR CODE that includes master secret in base64 in the url query string>".

That's a good one.