vacp2p / research

Thinking in code
MIT License
62 stars 4 forks source link

Algebraic attack to n-RLN #155

Open s1fr0 opened 1 year ago

s1fr0 commented 1 year ago

DISCLAIMER: this issue may contain inaccurate or false statements. Its sole purpose is to investigate feasibility of possible attack ideas for n-RLN. It is WIP.

Introduction

In the Shamir's Secret Sharing (SSS) polynomial of n-RLN, all (secret) coefficients are computed from the same secret a0 (and other public values) using the Poseidon hash function.

In Poseidon, each round consists of an non-linear SBOX followed by a linear mix which can be modeled in a simplified manner as

$$x \rightarrow k{m,i}\cdot(x+k{r,i})^p$$

with $p=3,5$ and $k_m, k_r$ derived from round constants (in practice few other caveats exist, but we keep its formulation simple for ease of exposition).

It follows that by iterating the round function, we obtain the sequence of transformations

$$x_0 \rightarrow x1 = k{m,1}\cdot(x0+k{r,1})^p \rightarrow x2 = k{m,2}\cdot(x1+k{r,2})^p \rightarrow \ldots \rightarrow xn = k{m,n}\cdot(x{n-1}+k{r,n})^p$$

where $x_n$ corresponds to our final hash.

These iterations can be modeled with a set of multivariate polynomials. By working over the multivariate polynomial ring defined over $\mathbb{F}_p$ with variables $x_0,\ldots,x_n$, we define the polynomials

$$I = \begin{cases} k_{m,1}\cdot(x0+k{r,1})^p - x1 \\ k{m,2}\cdot(x1+k{r,2})^p - x2 \\ \ldots \\ k{m,n}\cdot(x{n-1}+k{r,n})^p - x_n\end{cases}$$

It follows that the ideal generated by $I \cup \{x_0 - \bar{v}\}$ contains the polynomial $x_n - Hash(\bar{v})$. The intuition behind this is that by adding the polynomial $x_0 - \bar{v}$ to $I$, we're assigning the value $\bar{v}$ to $x_0$ which iteratively defines all values of the variables $x_i$ in $I$ up to $x_n$.

Attacking 2-RLN

In the following, we take as toy example 2-RLN, that is Shamir's secret shares come from a degree-2 polynomial

$$y = \bar{a}_0 + \bar{a}_1\cdot x + \bar{a}_2\cdot x^2$$

where $\bar{a}_0$ represents the identity secret and $\bar{a}_1 = Hash(\bar{a}_0)$ and $\bar{a}_2 = Hash(\bar{a}_1)$ (in practice the hash is computed together with other public values, but we omit this for now).

The relation between the values $\bar{a}_0, \bar{a}_1, \bar{a}_2$ can be then modeled with the two sets of polynomials

$$I1 = \begin{cases} k{m,1}\cdot(a0+k{r,1})^p - x{1,1} \\ k{m,2}\cdot(x{1,1}+k{r,2})^p - x{1,2} \\ \ldots \\ k{m,n}\cdot(x{1,n-1}+k{r,n})^p - a_1\end{cases}$$

$$I2 = \begin{cases} k{m,1}\cdot(a1+k{r,1})^p - x{2,1} \\ k{m,2}\cdot(x{2,1}+k{r,2})^p - x{2,2} \\ \ldots \\ k{m,n}\cdot(x{2,n-1}+k{r,n})^p - a_2\end{cases}$$

over the polynomial ring $\mathbb{F}_p[a_0,a_1,a2,x{1,1},\ldots,x{1,n},x{2,1},\ldots,x_{2,n}]$.

From what said before, we have that the ideal generated by $I$ and the polynomial $a_0 - \bar{a}_0$, will contain the polynomials $a_1 - \bar{a}_1$ and $a_2 - \bar{a}_2$, since assigning a value to $a_0$ will iteratively assign a value to $a_1$ and $a_2$ according to the relations defined by $I$.

Note that $a_0$ denotes a polynomial variable, while $\bar{a}_0$ is a value, that is an element of the underlying field $\mathbb{F}_p$.

Modeling Shares

In order to recover the secret value $\bar{a}_0$ (and $\bar{a}_1,\bar{a}_2$), in 2-RLN we need 3 shares, i.e. 3 random $(x,y)$ pairs coming from evaluating

$$y = \bar{a}_0 + \bar{a}_1\cdot x + \bar{a}_2\cdot x^2$$

We assume we only know two random shares $(\bar{x}_1,\bar{y}_1)$, $(\bar{x}_2,\bar{y}_2)$ and we model their values using the polynomial variables $a_0,a_1,a_2$ as

$$S=\begin{cases}a_0 + a_1\cdot \bar{x}_1 + a_2\cdot \bar{x}_1^2 - \bar{y}_1\\ a_0 + a_1\cdot \bar{x}_2 + a_2\cdot \bar{x}_2^2 - \bar{y}_2\end{cases}$$

If we evaluate the polynomials in $S$ is the values $\bar{a}_0,\bar{a}_1,\bar{a}_2$ that were originally used to compute the shares, all of them will be equal to 0.

In other words, $(\bar{a}_0,\bar{a}_1,\bar{a}_2)$ is a solution to $S$.

Is there enough information?

Can we recover the secret value $\bar{a}_0$ by just knowing 2 shares out of 3? The theory behind Shamir's secret sharing ensures that it would be impossible (that is, all values from $\mathbb{F}$ are likely possible for $\bar{a}_0$), but it assumes the polynomial coefficients to be generated randomly, hence they are independent from each other.

In n-RLN, instead, such coefficients all depend on $\bar{a}_0$ through a non-linear relation given by the employed hash function. In other words, the unique solution to the polynomial system $\mathcal{P} = I_1 \cup I_2 \cup S$ (i.e. assignments to all variables so that all polynomials evaluate to 0) is defined exclusively by assigning the correct value $\bar{a_0}$ to the variable $a_0$.

This means that the unique solution to the system $\mathcal{P}$, i.e. a polynomial system in $2n+3$ variables, depends on a single variable assignment, that is $a_0$.

Informally, this means that the total entropy of the solution is just $\log{p}$ bits rather than $3\cdot \log{p}$, as it would be if $\bar{a}_0, \bar{a}_1, \bar{a}_2$ were computed randomly.

While the polynomials in $I_1$ and $I_2$ constrain only the coefficients without of the SSS polynomial, without revealing any bits on their actual assignments, the share polynomials in $S$, instead, do.

Indeed, since 3 shares would suffice to recover 3 independent coefficients in text-book Shamir's Secret Sharing, this means that each share reveals $\approx \log p$ of information about the secret coefficients all together.

The intuition behind the attack is that since the solution to the system $\mathcal{P}$ has just $\log p$ bits of entropy, then 2 random shares would provide -on average- enough information (in principle $\approx 2\log p$ when coefficients are independent) in order to be able to recover $\bar{a}_0$.

The way we attempt to recover $\bar{a}_0$, is by searching a solution to $\mathcal{P}$.

Finding a solution

The standard method to compute a solution to a multivariate polynomial system is by computing Gröbner basis.

We skip the technical details of such method, but we only mention that the ideal generated by $\mathcal{P}$ has minimum-degree generators of the form

$$B = \begin{cases}a_0 - \bar{a}_0\\a_1 - \bar{a}1\\\ldots\\x{2,n}-\bar{x}_{2,n}\end{cases}$$

where $\bar{a}_0,\ldots,\bar{x}_{2,n}$ are the assignments for $a_0,\ldots,x_{2,n}$ so that all polynomials in $\mathcal{P}$ evaluate to 0.

Such set $B$ results to be a Groebner basis for $\mathcal{P}$, hence finding it will automatically provide a solution to the system, including recovering the secret value $\bar{a}_0$.

There exists few valid algorithms to compute Groebner basis (Buchberger, F4, F5, etc.), however their computational complexity is hard to estimate and there is no guarantee that even simple systems could be solved quickly.

Attacking 2-RLN: PoC implementation

With the aim to make the above ideas a bit more concrete and ease discussion, we provide a Sage PoC implementation of the Groebner basis approach over a reduced-round version of an hash function that mocks Poseidon hash.

The following script can be tested online on Sagecell. The prime $p$ employed is the order of the curve BN254.

p = 21888242871839275222246405745257275088548364400416034343698204186575808495617;
F = GF(p);
F.<a0,x1_1,x1_2,x1_3,a1,x2_1,x2_2,x2_3,a2> = PolynomialRing(F)

# Reduced round hash
# Computing a1
r1_1 = 2*(a0  +1)^5 - x1_1;
r1_2 = 3*(x1_1+2)^5 - x1_2;
r1_3 = 4*(x1_2+3)^5 - x1_3;
r1_4 = 5*(x1_3+4)^5 - a1;

# Computing a2, i.e. a2 = Hash(a1)
r2_1 = 2*(a1 + 1)^5 - x2_1;
r2_2 = 3*(x2_1+2)^5 - x2_2;
r2_3 = 4*(x2_2+3)^5 - x2_3;
r2_4 = 5*(x2_3+4)^5 - a2;

# Those are example secret values
# can be computed with
#ideal(r1_1, r1_2, r1_3, r1_4, r2_1, r2_2, r2_3, r2_4, a0-1234567890123456789012345678901234567890123456789012345678901234567890123456).groebner_basis()
a0_val = F(1234567890123456789012345678901234567890123456789012345678901234567890123456);
a1_val = F(-5688860465330553566251445615169286501027840278850879586337975469649514457926)
a2_val = F(-13127261654781763675836365051266965574386061451819211794160656918246739109430)

# Shares computation
x_val = ZZ.random_element(0,p)
y_val = a0_val + a1_val*x_val + a2_val*x_val^2;
share1 = a0 + a1*x_val + a2*x_val^2 - y_val;

x_val = ZZ.random_element(0,p)
y_val = a0_val + a1_val*x_val + a2_val*x_val^2;
share2 = a0 + a1*x_val + a2*x_val^2 - y_val;

# Recover secret values
B = ideal(r1_1, r1_2, r1_3, r1_4, r2_1, r2_2, r2_3, r2_4, share1, share2).groebner_basis()

for e in B:
    print(e)

The script will return a base $B$ for $\mathcal{P}$ consisting of degree-1 polynomials (i.e. assignments to each variable) that will solve the whole system $\mathcal{P} = \{r{1,1},r{1,2},r{1,3},r{1,4},r{2,1},r{2,2},r{2,3},r{2,4}, share_1, share_2\}$.

Food for thought (take carefully!)