Are all lines necessary in Shuffle?

lschoe / mpyc

MPyC: Multiparty Computation in Python

MIT License

379 stars 77 forks source link

Are all lines necessary in Shuffle? #79

Closed faykurt closed 11 months ago

faykurt commented 11 months ago

Hi, below part is from shuffle function. Are all the lines necessary or is it just to increase the randomness? x_u is already random. What is the purpose of scalar_mul and vector_add? Can't we get rid of them to increase the performance of shuffling by decreasing the execution time?

Also could you name the paper, study that shuffling algorithm originated?

    for i in range(n-1):
        u = random_unit_vector(sectype, n - i)
        x_u = runtime.in_prod(x[i:], u)
        d = runtime.scalar_mul(x[i] - x_u, u)
        x[i] = x_u
        x[i:] = runtime.vector_add(x[i:], d)
    return

lschoe commented 11 months ago

You can take a look at SecretSantaExplained.ipynb for an explanation of this shuffling algorithm. The algorithm generates each permutation with probability $1/n!$. If you stop earlier, e.g, if you replace for i in range(n-1) by for i in range(n-2) then you only get half of the permutations, each with probability $2/n!$.

faykurt commented 11 months ago

Dear @lschoe, for a very big n, 1/n! and 2/n! have no difference, they are both small enough. Can we optimize for i in range(n-1) part? Instead of 1, we may put a variable that relates with n size?

lschoe commented 11 months ago

Well, $n$ need not be very large here at all, and certain permutations are excluded with certainty. Also, dropping a few steps at the end does not lower the overall complexity very much, as these later steps require an entropy of only $\log_2 (n-i)$ secure random bits for $i$ close to $n$.

MarcT0K commented 11 months ago

@faykurt If you want to speed up the process, I implemented a parallelized version of the shuffle function (see PR #68).

On a private repo, I also implemented the Laur et al. shuffle. Its complexity is linear contrary to the current MPyC shuffle that is quadratic. However, Laur et al.'s shuffle has a much worse scaling in the number of parties.

faykurt commented 11 months ago

Thanks for your all replies, very appreciate and one more question :) Why did you implement Knuth shuffle instead of Sattolo shuffle?

lschoe commented 11 months ago

Sattolo's shuffle is not a uniform permutation, but a uniform cyclic permutation, which is something else.