Coin flipping protocol - Pedersen VSS setup

pdyraga commented 6 years ago

We need to figure out how to generate g and h parameters used for discrete log problem in Pedersen VSS.

This card covers research and pseudocode implementation of the setup protocol.

madxor commented 6 years ago

The coin flipping protocol research consisted of analysis of the following scientific papers:

"Coin Flipping by Telephone A Protocol for Solving Impossible Problems"
"Parallel Coin-Tossing and Constant-Round Secure Two-Party Computation"
"Collective Coin Flipping"
"Protocols for Multiparty Coin Toss with Dishonest Majority" and Extended
"Coin Tossing is Strictly Weaker than Bit Commitment"
"An Optimally Fair Coin Toss"
"Limits on the Security of Coin Flips When Half the Processors are Faulty" Extended
"Martingales Collective Coin Flipping and Discrete Control Processes"

The most simple realisation of the coin-flipping protocol is Blum's:

• Alice chooses a random bit a and sends a commitment c = commit(a) to Bob.
• Bob chooses a random bit b and sends it to Alice.
• Alice sends the bit a to Bob together with de-commit(c).
• If Bob does not abort during the protocol, Alice outputs a ⊕ b, otherwise she out-
puts a random bit.
• If Alice does not abort during the protocol and c is a commitment to a, then Bob
outputs a ⊕ b, otherwise he outputs a random bit.

As we can see it is pretty simple protocol but we need to have in mind two things:

It is only two-party, multiparty versions are rare and much more complex.
The network costs are high, if we want to generate n-bits in m-party environment we need to send at least n*m messages. In case we would like to generate 1024-bits random value then the number of transmitted messages will be significant. Also, we need to mind that every message needs to be properly cryptographically signed and verified, so we will have also high computing overhead.

During the analysis we have came to the conclusion that implementing coin-flipping protocol might be an overkill for our needs. Instead of adding such complex solution, that would take 2-3 weeks to implement (for multiparty case), we can implement following protocol that would provide us with much more efficient solution:

For every party in the group:
1. Generate a uniformly random number.
2. Generate and broadcast a checksum as concatenation [party_id, random_number, nonce].
3. Wait till all parties broadcast theirs checksums or timeout.
4. Broadcast random_number and nonce.
5. Wait till all parties broadcast theirs random_number and nonce or timeout.
6. Verify if the checksum is correct for every received random_number, if it is not, mark that party_id as malicious.
*7. Broadcast malicious and timeouted party_ids.
*8. Wait for all or timeout.
*9. If more than half of the parties showed a particular party_id as malicious or timeouted, mark it accordingly.
10. Calculate the final random value as a `xor` of all non-malicious and not-timeouted random_numer.

Steps 7, 8, and 9, marked with a star, are an optional way to protect the group from the "split-horizon" attack where two groups have different views of the malicious and timeouted parties. The reason why it might is an option is related with a higher probability of a denial of service.

Finally, it might be beneficial to add an additional steps where confirm that the final random value is the same for everyone.

11. Broadcast random_number.
12. Count the number of the same, received broadcasted random_numbers.
13. If more than half of parties have the same random_number, end with success.
14. If my random is different then the broadcasted one and more than half of the parties have the same, use it and end with partial success. [this step needs more consideration]
15. Otherwise end with failure.

The complexity of the proposed protocol might be a bit simplified when the network layer would provide a message delivery confirmation and timeout handling. But still, this is a proposal that requires discussion and a bit of alignment.

pdyraga commented 6 years ago

Some questions here:

From all the papers you analyzed, is the coin flipping protocol always about operating on single bits? So if we need, for example, to jointly generate a 512-bit integer, we need a magnitude of 512 rounds?
For our needs, what's wrong with a protocol when to generate 1024-bit integer, we expect all group members to:
```
   1. Generate 1024 bits
   2. Publish SHA-256 commitment to the choice
   3. Reveal the choice, evaluate commitment
   4. Evaluate the final result as XOR of all choices

    (*) If we detect malicious behaviour or member is inactive, we remove it 
    from the group and the member does not take part in DKG protocol.
```
We need to take into consideration the following factors:
- This number is significant only for a few seconds; we don't need a strong commitment scheme like Pedersen (to be honest, I started to wonder if we really need Pedersen in DKG if we want to complete the round in just a few seconds), SHA-256 should be enough,
- If we detect misbehaviour we can exclude the member from the group and either drop the share or restart the protocol without that member.

madxor commented 6 years ago

Coin flipping is always binary cause a coin has only two possible states. (not counting other a bit less probable ones ;)
I'm assuming that you're asking about the following protocol. This is simplified version, we have been discussing in the morning. There is nothing wrong with that approach. Maybe I've overcomplicated things in my previous comment cause I wanted to include as many belts and suspenders as possible. I have some doubts about sending just a SHA-256 due to the fact that we will get the same checksum for the same random number. And if someone would listen for a long time all the conversations, he could map checksums to the random numbers. That would be an advantages we need to consider if we would like to have or not.

Shadowfiend commented 6 years ago

If we're worried about people observing away the safety of an SHA-256 hash of a 1024-bit random number, they might as well just brute-force the SHA-256 hashes of all possible numbers [0,2^1024], no?

What Piotr is describing, perhaps with some additional tweaking, seems like a 1024-parallel version of the Blum protocol, to some extent?

Finally, regarding the alternate approach with commit-reveal, it feels like it could give us what we need… Particularly since anyone who times out can be excluded from the remaining key generation process. The question is, if they bias g or h, do they gain significant influence over the outcome of the DKG?

madxor commented 6 years ago

Brute-force is an option (expensive), therefore I've suggested some salting.

Nope, It's not 1024-paralleled version of Blum protocol, cause we are not verifying every bit in paralel. It is somewhat analogous to the Blum protocol.

It depends on the bias, but I don't think that it is realistic. Let's assume that we have two parties, honest and malicious, it is hard to imagine that one can influence the outcome of xor when honest will provide random data and malicious deterministic. Malicious can only influence the result if he can predict the random data coming from the honest party.

If we add quantum information theory here, then the situation will change but we are not there yet.

pdyraga commented 6 years ago

I don't think the real threat is that someone will prepare a rainbow table with all possible 1024-bit input SHA-256 combinations, especially that we can expect the commitment to be hashed with a group identifier or some other information.

Therefore, the adversary needs to brute force. If we take into account that we want this number generation protocol to complete in 1-2 seconds, there is no way, someone can brute-force in such a period.

The only way to bias the resulting number is to resign at the end of the reveal round. This is the same situation as the one mentioned by @Shadowfiend on flowdock:

This, however, has a drawback that, in practice, one cannot enforce that this revelation act happens precisely at the same moment for all parties involved. The last one to reveal has, in fact, total control on the outcome, which can be exploited by a malicious player. Even if we make the parties commit on their numbers, the last one still has an option of never revealing it, thus effectively introducing a bias to the final result.

However, if you resign and don't provide your number, you are excluded from the group and don't run DKG. OK, there is a possibility you cooperate with someone and someone staying in the group can exploit the biased result.

However, the way we'll use h is as follows:

If I don't have 100% control over the result and I did not solve the equation starting from discrete logarithm back to g and h the only threat is that I'll be lucky enough and the group will draw the "right" number for which I reversed-engineered logarithm. I believe it's 1:2^1024 possibility no matter how many adversaries are in the group (we use XOR, not OR or AND - that's important) and no matter if adversary leaves or stays in the group. We may go with 2048-bit number if 1:2^1024 is not enough for us.

By the way, isn't it exactly the same problem as if we had those numbers shipped by a trusted oracle?

pdyraga commented 6 years ago

@madxor Actually, can we run a small simulation on paper? Assume the worst scenario: adversary knows the discrete log. What is the impact for DKG? Are we fine until the threshold is not exceeded?

madxor commented 6 years ago

Remark: When I'm describing a potential attack vector, in most cases, I'm assuming the worst case scenario, combined with some edge cases. Most of them are not very likely but nevertheless they need to be discussed.

An adversary does not need to generate a rainbow-table with all possible 1024-bit hashes. He just needs to map enough hashes. It might be significantly less when we assume a faulty random number generators (OS related) or a birthday attack. This can be easily mitigated by adding an additional random value, like a group identifier (if it is random enough).
In the protocol I'm suggesting, if someone is unwilling to reveal his values in the "disclosure" phase, he will be marked as timeouted and his influence on the final result will not be taken into account.
Also it is important to note, that due to the fact that we have only one round of generating random numbers the way to force the final result to be a particular number, (lets assume 0) is to have 50% of malicious peers with foreknowledge, that know the random values of the other half of the peers before the end of the commitment/checksum broadcasting phase, and are able to set theirs own values as the same as the other half (A xor A = 0).
If an attacker would have a control over r, effectively he would control h cause h = r^k mod p where k = (p - 1)/q, and p, q are known cause are the parameters of the system. Going further, the attacker could simply pick an a that g^a = h which effectively is the solution to log_g(h). If he knows the logarithm, then he can mess with the commitments and win the complaints.
Regarding trusted oracle case, it also can work but we need to have one and answer if we really trust it (also, isn't it a centralisation ;).
We also could have even simpler approach when our trusted oracle would be "the prime" member of the group. For example, we could have a group selection algorithm, that would pick the most trusted peer (a primary peer) that could generate h and broadcast it to the other peers. But for this case we still need to answer if we trust "the prime".

madxor commented 6 years ago

Noteworthy is the fact that the coin-flipping protocol was created as a way to solve a particular problem, of two people playing a game of coin-flipping through a telephone. It was designed to decrease the possibility of cheating for that particular case.

IMHO it was not intended as a way to generate a long-bit random numbers in a multiparty environment. Theoretically, it has some interesting properties but those are relevant for multi round coin-flips and not really apropos our use case scenario.

madxor commented 6 years ago

Closing this because. we are modifying our DKG implementation to use elliptic curves, where the g parameter is defined by a curve. Therefore, only h parameter needs to be generated and this can be efficiently done by generating a point using a hash as an input (hash-to-point(previous-result).

Relevant flowdoc discussion: https://www.flowdock.com/app/cardforcoin/tech/threads/OFTK4tUpBT9MtkEpsOHqCmz7zbE

keep-network / keep-core

Coin flipping protocol - Pedersen VSS setup #372