facebookresearch / CrypTen

A framework for Privacy Preserving Machine Learning

Question: understanding fixed-point division #103

Closed: tshead2 closed this issue 4 years ago

tshead2 commented 4 years ago

I'm wondering if you could give me a high-level description of what the following code is doing in crypten/mpc/primitives/arithmetic.py:324 ... I'm especially confused about why it only applies when there are 3 or more players?

Thanks in advance, Tim

        if isinstance(y, int) or is_int_tensor(y):
            # Truncate protocol for dividing by public integers:
            if comm.get().get_world_size() > 2:
                wraps = self.wraps()
                self.share /= y
                # NOTE: The multiplication here must be split into two parts
                # to avoid long out-of-bounds when y <= 2 since (2 ** 63) is
                # larger than the largest long integer.
                self -= wraps * 4 * (int(2 ** 62) // y)
            else:
                self.share /= y
            return self
knottb commented 4 years ago

This is a workaround for truncation in the 3+ party case, and a full explanation requires some background knowledge and math:

The reason this is needed is explained in section 5.1 of [1].

The solution is explained in appendix C of [2].
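To make the failure mode concrete, here is a toy plain-Python sketch (my own illustration, not CrypTen code; `L`, `x`, and `y` are example values). When each party divides its own share by $y$, the result is off by roughly $\theta_x \cdot L / y$ whenever the share sum wraps past the modulus, which is nearly always the case for random shares:

    import random

    L = 2 ** 64   # ring size used by CrypTen's arithmetic sharing
    x = 1000      # secret plaintext
    y = 4         # public divisor

    # 3-out-of-3 additive sharing of x modulo L.
    x0, x1 = random.randrange(L), random.randrange(L)
    x2 = (x - x0 - x1) % L
    assert (x0 + x1 + x2) % L == x

    # Naive truncation: each party divides its own share by y.
    naive = (x0 // y + x1 // y + x2 // y) % L

    # theta_x counts how many times the share sum wrapped past L
    # (0, 1, or 2 for three parties); subtracting theta_x * (L // y)
    # removes the wrap-induced offset.
    theta_x = (x0 + x1 + x2 - x) // L
    corrected = (naive - theta_x * (L // y)) % L

    print(x // y)      # 250, the desired quotient
    print(naive)       # almost always wildly wrong
    print(corrected)   # equals x // y up to a small rounding error

The quoted snippet applies the same correction: `wraps * 4 * (int(2 ** 62) // y)` is $\theta_x \cdot (2^{64} / y)$ computed without materializing $2^{64} // y$ directly, which would overflow a signed 64-bit long when $y \le 2$.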

==================================================

In the appendix, we skip over the proof of correctness of the wraps function, since it is a bit more involved, but I can explain it here. We generalized some of the notation from [3] to define the following:

$L$ is the modulus used for secret sharing (in our case $2^{64}$)

$\theta_x$ is the wraps count for a variable $x$ defined by: $x = \sum_i x_i - \theta_x L$

$\beta_{x,r}$ is the differential wraps for variables $x$ and $r$ defined by: suppose $[z] = [x] + [r]$; then $z_i = x_i + r_i - {\beta_{x,r}}_i L$

$\eta_{x,r}$ is the plaintext differential wraps for variables $x$ and $r$ defined by: $z = x + r - \eta_{x,r} L$

The complete process we use to compute the wraps $[\theta_x]$ of a secret-shared variable $[x]$ is implemented in the `wraps()` method (called by the division code above) and follows Algorithm 3 of [2].

A TTP generates a random ring element $[r]$ and its wraps $[\theta_r]$ offline. We can then compute and reveal $[z] = [x] + [r]$ without exposing information about $[x]$ since $[r]$ is random. It can be shown from the equations above that:

$[\theta_x] = \theta_z + [\beta_{x,r}] - [\theta_r] - [\eta_{x,r}]$

Each of these values can be computed or supplied as follows:

  * $\theta_z$ can be computed directly by every party, since $z$ is revealed.
  * $[\theta_r]$ is generated offline by the TTP along with $[r]$.
  * $[\beta_{x,r}]$ can be computed locally without communication, since party $i$ can compute ${\beta_{x,r}}_i$ from its own shares $x_i$ and $r_i$.
  * $[\eta_{x,r}]$ cannot be computed without knowledge of $x$, so we assume $\eta_{x,r} = 0$; for uniformly random $r$ this holds with probability $1 - |x| / L$.
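As a sanity check, the identity above can be verified numerically with 3-out-of-3 additive shares. This is my own plain-Python sketch; the helpers `share` and `wraps_count` are illustrative, not CrypTen's API:

    import random

    L = 2 ** 64
    n = 3

    def share(v):
        """Additively share v modulo L among n parties."""
        s = [random.randrange(L) for _ in range(n - 1)]
        s.append((v - sum(s)) % L)
        return s

    def wraps_count(shares, v):
        """theta_v, defined so that sum(shares) = v + theta_v * L."""
        return (sum(shares) - v) // L

    x, r = 12345, random.randrange(L)
    xs, rs = share(x), share(r)
    theta_x, theta_r = wraps_count(xs, x), wraps_count(rs, r)

    # Party i reduces x_i + r_i mod L, recording the differential wrap beta_i.
    zs = [(xi + ri) % L for xi, ri in zip(xs, rs)]
    beta = sum((xi + ri) // L for xi, ri in zip(xs, rs))

    z = (x + r) % L          # revealed plaintext of [z]
    eta = (x + r - z) // L   # plaintext differential wrap
    theta_z = wraps_count(zs, z)

    assert theta_x == theta_z + beta - theta_r - eta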

Note that we can reduce the probability of error by decreasing the magnitude of $x$ or by increasing the size of our ring $L$. Additionally, since $x$ is usually scaled by our FixedPointEncoder, we can also reduce this probability by decreasing our fixed-point precision.
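The $|x| / L$ failure probability is also easy to check empirically. A quick Monte Carlo sketch (my own, with a deliberately small toy ring so the probability is visible):

    import random

    L = 2 ** 16   # toy ring size; CrypTen uses 2 ** 64
    x = 2 ** 10   # |x| / L = 1 / 64

    trials = 200_000
    # eta_{x,r} = 1 exactly when the plaintext sum x + r wraps past L,
    # i.e. when the eta = 0 assumption fails.
    failures = sum((x + random.randrange(L)) >= L for _ in range(trials))

    print(failures / trials)   # ~0.0156, matching |x| / L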

==================================================

There are a few methods used in practice for performing this computation, each with its own tradeoffs:

  1. Our method will produce errors (off by exactly $L / B$, where $B$ is the public divisor) with probability $|x| / L$.

  2. The method of [1] requires replicated secret sharing, which does not hold under dishonest-majority assumptions since it uses $k$-out-of-$n$ secret sharing, and it requires on the order of $\binom{n}{k}$ computations to perform each multiply.

  3. The method of [3] does not hold under dishonest-majority assumptions since it requires asymmetric computation (where the 3rd party acts as an online TTP for computations between party 1 and party 2).

  4. The method of Section 3.4 of [4] produces errors that are only as large as the quantization error, but requires that we operate within a field (prime modulus), and requires computation using binary circuits.

We chose to implement solution 1 since it is the simplest to implement and very performant, as it does not require any bit decompositions. Different system models may benefit from each solution differently.

==================================================

[1] Payman Mohassel and Peter Rindal. ABY3: A Mixed Protocol Framework for Machine Learning.
[2] Awni Hannun, Brian Knott, Shubho Sengupta, and Laurens van der Maaten. Privacy-Preserving Multi-Party Contextual Bandits.
[3] Sameer Wagh, Divya Gupta, and Nishanth Chandran. SecureNN: Efficient and Private Neural Network Training.
[4] Octavian Catrina and Amitabh Saxena. Secure Computation with Fixed-Point Numbers. 2010.

tshead2 commented 4 years ago

Thanks for the quick and wonderfully detailed response!

Cheers, Tim