WalletWasabi / WabiSabi


Deciding on the applied Blind Signature Scheme #2

Closed: seresistvanandras closed this issue 4 years ago

seresistvanandras commented 4 years ago

Essentially, what we are trying to achieve boils down to a single question: what is the appropriate blind signature scheme for our specific application scenario, one that supports these two added functionalities:

  1. After the blinding phase, users should be able to prove that the sum of the committed messages equals a public value. This corresponds to UTXO splitting in the input registration phase.
  2. At the reblinding phase, users should again be able to prove that the values "inside" valid signatures add up to a public value. During our conversations we referred to this as UTXO merging.

I claim that BLS blind signatures with a little tweak can achieve what we aim for. For the sake of self-containment, I enclose the definition of BLS blind signatures.

Let $e: G\times G\rightarrow G_T$ be a pairing, where $G$ and $G_T$ are groups of prime order $p$, let $H:\{0,1\}^* \rightarrow G$ be a hash function, and let $g$ be a generator of $G$. (I use multiplicative notation and describe the BLS signature for symmetric pairings, but they can also be defined for asymmetric Type 2 and Type 3 pairings; see for instance here.)

In order to generate a BLS signature, the signer with private key $x\in Z_p^*$ ($y=g^x$ is the public key) computes $\sigma=H(m)^x$, and the verifier checks whether $e(H(m),y)=e(\sigma,g)$ holds. The correctness is obvious, since $e(H(m),y)=e(H(m),g^x)=e(H(m),g)^x = e(H(m)^x,g) = e(\sigma,g)$.

Now the blind signature version is as follows: the user picks a random blinding factor $r\in Z_p$ and sends $\overline{m}=H(m)g^r$ to the signer; the signer replies with $\overline{\sigma}=\overline{m}^x=(H(m)g^r)^x$; finally, the user unblinds by computing $\overline{\sigma}y^{-r}=\sigma$.

It is easy to verify that this is correct, since $\overline{\sigma}y^{-r} = (H(m)g^r)^x y^{-r}=H(m)^x g^{rx}g^{-rx}=H(m)^x=\sigma$.
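
A minimal Python sketch of this blind-sign/unblind roundtrip, using a small prime-order subgroup of $\mathbb{Z}_q^*$ in place of the pairing group $G$; the pairing-based verification is replaced by a direct check against $H(m)^x$, which only a demo that knows $x$ can do. All parameters are illustrative, not a secure instantiation:

```python
# Toy demo of the blind BLS flow: blind, sign, unblind.
import secrets

q = 2039                  # safe prime, q = 2*p + 1
p = 1019                  # order of the subgroup of squares mod q
g, h = 4, 9               # two subgroup generators (dlog relation unknown to users)

def H(m):                 # "modular exponentiation" hash: H(m) = h^m
    return pow(h, m, q)

x = secrets.randbelow(p - 1) + 1   # signer's private key
y = pow(g, x, q)                   # signer's public key, y = g^x

m = 777                            # message (e.g. an amount)
r = secrets.randbelow(p)           # user's blinding factor

blinded = (H(m) * pow(g, r, q)) % q       # user -> signer: H(m) * g^r
blind_sig = pow(blinded, x, q)            # signer -> user: (H(m) * g^r)^x
sigma = (blind_sig * pow(y, -r, q)) % q   # user unblinds: sigma_bar * y^(-r)
                                          # (Python 3.8+ inverts y mod q for the
                                          # negative exponent)

assert sigma == pow(H(m), x, q)    # sigma = H(m)^x, matching the correctness equation
```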

How shall we instantiate the hash function? In our case the hash function might be modular exponentiation, namely $H(m)=h^m \bmod p$. However, this is not a second-preimage-resistant hash function, since $H(m)=H(m+p-1)$ by Fermat's little theorem. This can be fixed if we restrict the message space to $m\in\mathbb{Z}_p$, which can be enforced by using some range proof system.
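
The collision is easy to see numerically (toy parameters):

```python
# Fermat's little theorem: h^(p-1) = 1 (mod p), hence H(m) = H(m + p - 1).
p, h, m = 101, 3, 42          # illustrative toy parameters
H = lambda m: pow(h, m, p)
assert H(m) == H(m + p - 1)   # a second preimage, found for free
```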

After settling on the blind signature scheme and the applied hash function how can we prove our desired properties?

  1. Suppose a user registers a public input UTXO $m$ and wants to split it into $m_1, m_2$, but without telling the server either $m_1$ or $m_2$. If the hash function is modular exponentiation, then one can look at the blinded message as a Pedersen commitment to the message, for which we know how to check simple relations thanks to the homomorphicity of the commitment scheme. If a user has $\overline{m_1}=H(m_1)g^{r_1}=h^{m_1}g^{r_1}$ and $\overline{m_2}=H(m_2)g^{r_2}=h^{m_2}g^{r_2}$, then they can easily convince the server that $m=m_1+m_2$ by opening the product of the two commitments, i.e. $\overline{m_1}\overline{m_2}=h^{m_1+m_2}g^{r_1+r_2}$, by sending $r_1+r_2$ to the signer. This naturally generalizes to more than two split UTXOs. (See the sketch after this list.)

  2. Similarly, a user could prove that two values "in a signature" add up to a public value (UTXO merging) via this verification equality: $\sigma_1\sigma_2=h^{m_{1}x}h^{m_{2}x}=h^{x(m_1+m_2)}$.
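
A toy Python sketch of both checks, with the same illustrative group parameters as above; the split values, randomness, and signer key are made up for the demo:

```python
# Toy check of UTXO splitting and merging.
import secrets

q, p, g, h = 2039, 1019, 4, 9

m = 700                                  # public input amount
m1, m2 = 300, 400                        # secret split, m1 + m2 = m
r1, r2 = secrets.randbelow(p), secrets.randbelow(p)

c1 = (pow(h, m1, q) * pow(g, r1, q)) % q     # blinded m1: h^m1 * g^r1
c2 = (pow(h, m2, q) * pow(g, r2, q)) % q     # blinded m2: h^m2 * g^r2

# Splitting: the server checks c1 * c2 = h^m * g^(r1 + r2) given only the
# public sum m and the combined opening r1 + r2, never m1 or m2.
assert (c1 * c2) % q == (pow(h, m, q) * pow(g, r1 + r2, q)) % q

# Merging: the product of two signatures commits to the sum of the values.
x = secrets.randbelow(p - 1) + 1             # signer's key, demo only
s1, s2 = pow(h, m1 * x, q), pow(h, m2 * x, q)
assert (s1 * s2) % q == pow(h, (m1 + m2) * x, q)
```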

In summary, I think the real question is whether we can find better blind signatures supporting our needs or not.

nopara73 commented 4 years ago

Shouldn't it also be a requirement for the homomorphic commitment scheme that range proofs can be applied to it, or is that always a given?

It's somewhat outside the scope of this conversation, but resolving the double-spending issue is also a requirement, though it may be possible to resolve that later.


Now that I got to the end, aren't both of my issues resolved, even the double spend? Since the blind sigs can only be used once, we can verify that signatures aren't reused when we prove the sum!

seresistvanandras commented 4 years ago

Yeah, indeed! 120% agree! These are two crucial additional requirements.

  1. We need to prevent double-spending!
  2. The commitment scheme should also support some efficient range proof system!

nothingmuch commented 4 years ago

small typo: i think \sigma' is supposed to be \overline{\sigma} (also TIL about this hack for math in github, and... wat. why like this github?!)

this seems like a simpler blind signature scheme than the randomizable signature scheme we discussed.

however, i do think double spending is still a problem:

  1. Alice obtains signatures for m_1, m_2, m_3 (by registering one or more inputs, doesn't matter)
  2. Alice registers outputs using up to 6 different signature combinations, each with an unlinkable blinding factor
    • \sigma_1\sigma_2, r_1 + r_2 for m_1 + m_2
    • \sigma_1\sigma_3, r_1 + r_3 for m_1 + m_3
    • \sigma_2\sigma_3, r_2 + r_3 for m_2 + m_3
    • \sigma_1, r_1 for m_1
    • \sigma_2, r_2 for m_2
    • \sigma_3, r_3 for m_3

so i think we need some sort of accumulator for the blinding factors, and to prove that each sum is due to a combination of previously unused blinding factors, similar to zcash nullifiers.

as for range proofs - since these are Pedersen commitments isn't that exactly the same as what's used in confidential transactions?

nothingmuch commented 4 years ago

sorry, that's not correct, since if alice reveals all the r values the coordinator can subtract them to identify this behaviour. but alice can just generate 6 more commitments to value 0 and add them to each registration.
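
A quick sketch of why a commitment to 0 re-randomizes the opening without changing the committed amount (same toy parameters as above):

```python
# Multiplying in h^0 * g^r0 = g^r0 changes the opening but not the committed
# amount, so reused blinding factors no longer show up in the revealed sums.
import secrets

q, p, g, h = 2039, 1019, 4, 9

m, r = 700, 123
c = (pow(h, m, q) * pow(g, r, q)) % q        # original commitment to m
r0 = secrets.randbelow(p)                    # fresh blinding for a 0-commitment
c0 = pow(g, r0, q)                           # h^0 * g^r0

# Still opens to m, but with the unlinkable combined opening r + r0.
assert (c * c0) % q == (pow(h, m, q) * pow(g, r + r0, q)) % q
```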

nopara73 commented 4 years ago

@nothingmuch just to confirm. Your conclusion is that double spending is prevented this way, too, right?

nothingmuch commented 4 years ago

no, not without accumulators (but that should be a solved problem, zcash accomplishes it in a more complicated setting, and 2 of the divisible e-cash papers we shortlisted for review are based on accumulators and are likely to be a bit simpler than how zcash does it)

edit: i jumped to conclusion there, obviously accumulators are not the only way

seresistvanandras commented 4 years ago

small typo: i think \sigma' is supposed to be \overline{\sigma}

Indeed. Thanks Yuval! Should be fixed now!

however, i do think double spending is still a problem:

  1. Alice obtains signatures for m_1, m_2, m_3 (by registering one or more inputs, doesn't matter)
  2. Alice registers outputs using up to 6 different signature combinations, each with an unlinkable blinding factor
  • \sigma_1\sigma_2, r_1 + r_2 for m_1 + m_2
  • \sigma_1\sigma_3, r_1 + r_3 for m_1 + m_3
  • \sigma_2\sigma_3, r_2 + r_3 for m_2 + m_3
  • \sigma_1, r_1 for m_1
  • \sigma_2, r_2 for m_2
  • \sigma_3, r_3 for m_3

This is a great and important observation, an important attack vector we need to mitigate! I show an easy fix below!

so i think we need some sort of accumulator for the blinding factors, and to prove that each sum is due to a combination of previously unused blinding factors, similar to zcash nullifiers.

I think we do not even need such heavy cryptographic machinery as accumulators. Don't forget that simplicity is the ultimate design goal! Just a quick reminder: $g$ and $h$ are generators of the elliptic curve group $\mathbb{G}$ we use, and nobody but (at most) the coordinator knows the discrete logarithm between $g$ and $h$; in other words, users know no $x_i$ or $x_j$ such that $g^{x_i}=h$ or $h^{x_j}=g$. Maybe the coordinator does not know it either; the only important thing is that mixing participants must not know it, otherwise they could cheat. The secret key of the coordinator is $x$, while the public key of the coordinator is $g^x$.

Whenever a user would like to merge two UTXOs, the user encloses the hashes of the UTXO values and the coordinator's signatures on them: $(H(m_1),\sigma(m_1),H(m_2),\sigma(m_2))=(h^{m_1},h^{m_1 x},h^{m_2},h^{m_2 x})$. Since $m_1+m_2$ is public, the coordinator can check whether $h^{x(m_1+m_2)}=h^{xm_1}h^{xm_2}=\sigma(m_1)\sigma(m_2)$ holds. The coordinator accepts each signature only once, therefore double-spending is effectively and elegantly solved; see the sketch below. Note that the coordinator does not learn the underlying messages; it only learns that these are valid signatures on the hashes of SOME messages. However, a small message space might cause problems...
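
A minimal Python sketch of this check over the same toy group; the `redeem` helper and all parameters are illustrative assumptions, not part of the design:

```python
# Toy version of the coordinator's double-spend check: verify the presented
# signatures against the public sum, and accept each signature only once.
import secrets

q, p, h = 2039, 1019, 9
x = secrets.randbelow(p - 1) + 1     # coordinator's private key

seen = set()                         # signatures already redeemed

def redeem(sigs, public_sum):
    prod = 1
    for s in sigs:
        prod = (prod * s) % q
    if prod != pow(h, x * public_sum, q):   # sigma(m1)*sigma(m2) = h^(x*(m1+m2))?
        return False
    if any(s in seen for s in sigs):        # each signature accepted only once
        return False
    seen.update(sigs)
    return True

m1, m2 = 300, 400
s1, s2 = pow(h, m1 * x, q), pow(h, m2 * x, q)
assert redeem([s1, s2], m1 + m2)     # first merge is accepted
assert not redeem([s1, s2], m1 + m2) # replaying the same signatures is rejected
```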

Still, we have other difficulties arising from the small message space: the coordinator knows that $m$ is less than, say, 10 BTC, which amounts to $10^9$ satoshis. Hence, when the coordinator sees the hash of some message $m$, i.e. $H(m)$, it could simply figure out the underlying message by brute-forcing the exponent; put differently, by solving the DLOG with a simple and efficient brute-force search, thanks to the constrained message space. Maybe this can be solved with some clever padding scheme... I should think about this more!
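
To illustrate the concern, a toy brute-force of the exponent; the group and amount are illustrative, and a real attack would run over a curve group with amounts up to ~$10^9$:

```python
# With a bounded message space, the coordinator can brute-force m from H(m) = h^m.
q, h = 2039, 9
secret_m = 777                       # an amount the user meant to hide
target = pow(h, secret_m, q)         # H(m), as seen by the coordinator

recovered = next(m for m in range(10**6) if pow(h, m, q) == target)
assert recovered == secret_m         # DLOG solved by exhaustive search
```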

as for range proofs - since these are Pedersen commitments isn't that exactly the same as what's used in confidential transactions?

They're the same, so it should not be a problem! :) We'll surely find appropriate range proof systems if we decide to use BLS with the modular exponentiation hash function.

nopara73 commented 4 years ago

Maybe this can be solved with some clever padding scheme...

Plan Z could be to add a bunch of random sub-satoshi amounts, so the message space would be sufficiently large. In that case there can be up to (number of commitments) * 1 satoshi of error in the final sum, so we have to make sure it's handled properly in the software.

seresistvanandras commented 4 years ago

This is superb! You are right! I had a way more complicated solution in mind, but yours is the simplest and most elegant we could come up with, I think!

nopara73 commented 4 years ago

Oh, actually just make the 0.1 and 0.01 satoshi decimal places constant zeros and then the software won't even need to care about it anymore.

nothingmuch commented 4 years ago

something to consider that i thought I'd mention since it's now buried in the slack log:

suppose we align the amount so that the bottom ~128 bits of m are completely random and never represent a quantity of more than, say, a millisatoshi (to prevent them summing to more than 1 sat). this means the top bits of m would be strongly biased.

this bias in the field elements might mean that the coordinator could use e.g. lattice-based attacks to uncover the amounts
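
For concreteness, a sketch of the padding layout under discussion; the 128-bit pad width and the `pad()` helper are assumptions drawn from this comment, not a settled design:

```python
# Amount in the high bits, ~128 random sub-satoshi bits below.
import secrets

PAD_BITS = 128

def pad(amount_sats):
    return (amount_sats << PAD_BITS) | secrets.randbits(PAD_BITS)

m = pad(50_000)
assert m >> PAD_BITS == 50_000       # the amount survives the padding

# The bias pointed out above: total supply is ~2.1e15 sats (< 2^51), so
# m < 2^179; against a ~256-bit group order the top ~77 bits of m are
# always zero, a structure a lattice-style attack might exploit.
```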

nopara73 commented 4 years ago

Summary of this conversation: https://github.com/nopara73/Yohohoho/issues/5

Closing this issue, @nothingmuch's attack is addressed there.