dedis / kyber

Advanced crypto library for the Go language
Other
628 stars 170 forks source link

Bug: share/dkg/rabin: All N participants must be available #472

Closed vibs29 closed 1 year ago

vibs29 commented 2 years ago

Hi. I'm having trouble generating a distributed key when some (allowed number of) participants is offline. Is my expectation wrong that some number are allowed to be offline? Otherwise, either I'm using the library incorrectly or it has a bug.

I've produced a small test file, threshold_test.go, that demonstrates the problem. It can simulate 1/7 participants being offline. It needs to be added to the share/dkg/rabin directory, from where you can then do "go test". If you set the index BAD to -1 (simulating all participants being online), you'll see the tests all pass. But if you set it to 1 (simulating the participant at that index being offline), then TestDKGSecretCommits2() fails, whereas I still expect it to pass. Shouldn't it pass even with BAD=1?

// share/dkg/rabin/threshold_test.go
package dkg

import (
    "testing"
    "github.com/stretchr/testify/assert"
    "github.com/stretchr/testify/require"
)

const GOOD int = 0
const BAD int = 1 /* index of dkg to simulate as offline, -1 will disable */

func TestDKGSecretCommits2(t *testing.T) {
    fullExchange2(t)
    dkg := dkgs[GOOD]
    _, err := dkg.SecretCommits()
    assert.Nil(t, err) //"dkg: can't give SecretCommits if deal not certified"
}

func fullExchange2(t *testing.T) {
    dkgs = dkgGen()
    // full secret sharing exchange
    // 1. broadcast deals
    resps := make([]*Response, 0, nbParticipants*nbParticipants)
    for i, dkg := range dkgs {
        if i != BAD {
            deals, err := dkg.Deals()
            require.Nil(t, err)
            for j, d := range deals {
                if j != BAD {
                    resp, err := dkgs[j].ProcessDeal(d)
                    require.Nil(t, err)
                    require.Equal(t, true, resp.Response.Approved)
                    resps = append(resps, resp)
                }
            }
        }
    }
    // 2. Broadcast responses
    for _, resp := range resps {
        for i, dkg := range dkgs {
            // ignore all messages from ourself
            if resp.Response.Index == dkg.index {
                continue
            }
            if i != BAD {
                j, err := dkg.ProcessResponse(resp)
                require.Nil(t, err)
                require.Nil(t, j)
            }
        }
    }
    // 3. make sure everyone has the same QUAL set
    for i, dkg := range dkgs {
        if i != BAD {
            dkg.SetTimeout() //line below fails otherwise
            require.True(t, dkg.Certified())
            for j, dkg2 := range dkgs {
                if j != BAD {
                    require.True(t, dkg.isInQUAL(dkg2.index))
                }
            }
        }
    }
}
vibs29 commented 2 years ago

Adding one line of code makes the test pass. But I don't know if the line makes sense and keeps the protocol secure. Does it?

// share/dkg/rabin/dkg.go
// added first line to method body

func (d *DistKeyGenerator) SetTimeout() {
    (*d).dealer.SetTimeout()
    for _, v := range d.verifiers {
        v.SetTimeout()
    }
}
vibs29 commented 2 years ago

I've created pull request 473 containing the test and code above.

pierluca commented 1 year ago

@si-co what do you think about this? My understanding is that this behaviour is by design, as the DKG requires all participants to be online in the generation phase so that we guarantee that the DKG decryption works even in the absence of n-t nodes. If my understanding is correct, we can close the issue.

vibs29 commented 1 year ago

I'd like to make one correction to what I said. I'd said that my fix allowed n-t participants to be offline. What I should have said was that it allowed t-1 participants to be offline. That's the number that the threshold scheme allows to be offline (or more generally, to be attackers).

si-co commented 1 year ago

I agree with @pierluca: the number of participants in the generation phase defines the variable n, which in turn impacts the number of nodes needed for the decryption. IMO the behavior is correct and nodes skipping the generation phase should be handled at an higher level.