HarryR / ethsnarks

A toolkit for viable zk-SNARKS on Ethereum, Web, Mobile and Desktop
GNU Lesser General Public License v3.0
240 stars 57 forks source link

Implement MiMCsponge #131

Open HarryR opened 5 years ago

HarryR commented 5 years ago

As per: https://gist.github.com/HarryR/a142d8ea442be7c05bf6c5edd0d8c488

Kobi has implemented EVM contract, JS library and constraints for 'MiMCsponge', which uses MiMC in Feistel mode with a sponge-like construct.

knarz commented 5 years ago

Shouldn't—according to the paper—the first and the last round constant be 0 and the swap operation be forgone in the last round?

HarryR commented 5 years ago

@kobigurk what are your thoughts?

I am using 0 as the first round constant, but not for the last round constant, see: https://gist.github.com/HarryR/a142d8ea442be7c05bf6c5edd0d8c488#file-mimcsponge-py-L55

I don't see any justification in the paper specifically for not using a round constant in the first and last rounds, nor do I see justification for not performing the swap in the last round. Can you elaborate or find some references specifically about generic Feistel construction which can offer some insight (e.g. re: not swapping in the last round)?

There is a followup paper with many of the same authors discussing MiMC: Feistel Structures for MPC, and More, Extended Version - in this paper they don't seem to make a distinction for the first or last rounds regarding the round constant or swap operation.

There is reference code for the Feistel variant: https://github.com/byt3bit/mimc-feistel_snark

But there is no reference code for the sponge construct.


Here is page 5 §2.1 from https://eprint.iacr.org/2016/492.pdf for reference

c

HarryR commented 5 years ago

@byt3bit Do you have any input on this kind of thing?

byt3bit commented 5 years ago

@HarryR In MiMC( EM mode as in the figure above) , the round constant at the first round can easily be combined with the plaintext/input. So from the security perspective it does not give any advantage. The same can be said about the last round constant. In Feistel mode we maintained the same design choice.
In G(eneralised)MiMC also it’s the same for the first and last round. The description on the eprint article does not specify this explicitly. Thanks for pointing this out, I will put it there. Making the (first and last) round constants non-zero in Feistel mode doesn’t change the security. Regarding the swap, as in any Feistel structure the last round swapping has no effect in security.

I am currently making changes to the implementation of GMiMCHash in sponge mode. So it's not there at the moment.

knarz commented 5 years ago

Thanks for the reply, @byt3bit

I did some digging around and found the following in [1]—just to add some more context:

A key addition layer is applied before the first round. The motivation for this initial key addition is the following. Any layer after the last key addition in the cipher can be simply peeled off without knowledge of the key and therefore does not contribute to the security of the cipher (e.g. the initial and final permutation in the DES). On the other side, in order to make the cipher and its inverse more similar in structure, the linear mixing layer of the last round is different from the mixing layer in the other rounds. It can be shown that this does not improve or reduce the security of the cipher in any way. This is similar to the absence of the swap operation in the last round of the DES.

[1]: Sanchez-Avila, C., and R. Sanchez-Reillol. "The Rijndael block cipher (AES proposal): a comparison with DES." Proceedings IEEE 35th Annual 2001 International Carnahan Conference on Security Technology (Cat. No. 01CH37186). IEEE, 2001.

byt3bit commented 5 years ago

@knarz The "peeling off" is the same thing which I described as combining with the input/output for the first/last round.

kobigurk commented 5 years ago

@HarryR I don't mind changing it to match the description, though it seems that it might not be necessary. @byt3bit By "not doing the swap in the last round", it means that xL and xR are reversed?

HarryR commented 5 years ago

An example of this would be:

last_round = (previous_round + C_i + k) ** 3

Where knowing C_i and 3 it serves no purpose, as you can find the cube root of the output then subtract the known round constant, to get the output of the previous round.

This only seems relevant for encryption, where it reduces the round count by one, but with hashing where we know all parameters we're relying on a different argument about the total number of rounds - or even which parameters can be controlled by an attacker to find a preimage.


There are two other similar variants of the MiMC permutation being used as the core of a compression function, in a way which isn't using the sponge construct:

1) https://github.com/HarryR/ethsnarks/blob/master/ethsnarks/mimc.py#L126 2) https://github.com/iden3/circomlib/blob/master/circuits/mimc.circom

The first uses the MiMC permutation together with Miyaguchi–Preneel (where the input and previous output is mixed together with the result of the permutation, this is then used as the key for the next round. The second just uses the output of the previous permutation as the key for the next round, without mixing the previous output and message into the result.

This is possibly the cheapest variant of a compression function we can find which is compatible with zkSNARKs, but there are still many questions about the overall security of this approach, specifically regarding pre-image and collision resistance.


Another question somebody asked me, is using separate round constants to create different domains. Or does a separate key with the same set of round constants essentially create a different domain?

e.g. the work of interpolation being re-used for the same domain with a different key, versus the work of interpolation required for each independent domain.

byt3bit commented 5 years ago

@kobigurk yes, if xL and xR denote the left and right output after swapping

HarryR commented 5 years ago

@byt3bit What would be really helpful is your gut feel...

MiMC + sponge, vs MiMC + one-way-compression-function

If you had to bet a million dollars on one, which would it be...

No pressure ;)

kobigurk commented 5 years ago

@HarryR I updated my implementation to act like the paper says: https://github.com/kobigurk/circomlib/pull/2

Also, thanks for raising the burning question to @byt3bit :) From my side, I wouldn't discount Miyaguchi-Preneel be OK, as MiMC is a cipher, though I really want to see an expert opinion before deviating from the literature.

byt3bit commented 4 years ago

@HarryR So far I have not found any security issues with the fixed key version of MiMC or Feistel MiMC. I am not aware of any analysis which shows any vulnerability. Based on that, I would say it can be securely used in a mode like Miyaguchi-Preneel. There is a recent paper (at SAC 2019, eprint version https://eprint.iacr.org/2019/812 ) which analyses the degree requirement of the polynomial corresponding to MiMC.

byt3bit commented 4 years ago

@HarryR, @kobigurk There is new analysis of Feistel-MiMC which may or may not have an effect on the mode. My updated comment is that, I have to check this further to confirm the security.

HarryR commented 4 years ago

This is good news, but I still need to re-read and fully digest the 2019/812 paper.

There are other questions which are still outstanding, such as whether or not MiMC (in any mode) can be used wherever a random oracle is desired, for example to hash h=H(R,A,M) in the EdDSA protocol, where I was under the understanding that it would be sufficient to use something which only has strong collision resistance properties.

kobigurk commented 4 years ago

Harry,

Regarding that, we had a twitter thread about EdDSA yesterday where it was shown by Daira that you do need a random oracle (unless you're using the Generic Group model). Do you have a reference for requiring only collision resistance?

On Mon, Aug 19, 2019, 17:28 HaRold notifications@github.com wrote:

This is good news, but I still need to re-read and fully digest the 2019/812 paper.

There are other questions which are still outstanding, such as whether or not MiMC (in any mode) can be used wherever a random oracle is desired, for example to hash h=H(R,A,M) in the EdDSA protocol, where I was under the understanding that it would be sufficient to use something which only has strong collision resistance properties.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/HarryR/ethsnarks/issues/131?email_source=notifications&email_token=AA23MGHF6OBHFQG7RHOTYZDQFKUYLA5CNFSM4H4DKZ52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4TD5GA#issuecomment-522600088, or mute the thread https://github.com/notifications/unsubscribe-auth/AA23MGCN225QZE7NNYWHBQDQFKUYLANCNFSM4H4DKZ5Q .

HarryR commented 4 years ago

For reference:

I'm failing to see any solid argument put-forth by Daira about ROM vs TCR, nor any insight into intuition, other than an assertion that it is fact.

From the paper Hash Function Requirements for Schnorr Signatures, the hash function needs two properties:

There is also another potential problem with using MiMC as the H function, the order of the curve is larger than the field, but the field is larger (by about 3 bits) than the order of the curve divided by the cofactor. So there is an injective mapping between the output of H and points on the curve, but a surjective mapping after removing the cofactor.

The EdDSA verification algorithm being used is:

def Verify(A, R, m, s):
    h = H(R,A,m)
    S = G*s
    assert S - (A*h) == R
    return S == R + (A*h)

This differs from the algorithm describes in section 3 of the paper 'The Schnorr Signature Scheme', where their verification scheme doesn't seem to work. You can recover R from G*s, but you need R == G*s - A*h, rather than their stated g^s · pk^{-h}. Additionally the paper doesn't include the public key in the H(...) -> h calculation.

The two adversarial models outlined in the paper are:

  1. Find s and R such that R = G*s - A*h, with a fixed m
  2. Find s, R and m' with fixed s and R

Using MiMC with the Preneel OWF construct the second case is conceptually easier to understand as there are far fewer variables. The H function is in a known state prior to inputting m', the argument is that if you know values of h which satisfy the verification you must find the input which results in one of the h values... but having pre-determined values of h kinda assumes you've broken dlog for the curve and are now relying solely on preimage-resistance of the hash function to prevent it... which doesn't really make sense.

And... I'm spending too much time on this, but would like to get to the bottom of it, because I'm still not seeing strong arguments for ROM.

byt3bit commented 4 years ago

The cryptanalysis of Feistel-MiMC (and GMiMC) block cipher (https://eprint.iacr.org/2019/951) is due to the extremely simple key scheduling. There is a fix for this and it will be uploaded soon. The cryptanalysis does not apply to MiMC.

daira commented 4 years ago

I'm failing to see any solid argument put-forth by Daira about ROM vs TCR, nor any insight into intuition, other than an assertion that it is fact.

Hmm? The burden of proof that TCR is sufficient is on anyone who makes that claim. The Wikipedia page (before I fixed it) made the claim without any kind of support. The EdDSA papers certainly don't claim it. If you want evidence that it's false, it's probably possible to gerrymander a counterexample of a TCR hash function that is insecure with EdDSA; I'd start by trying a Pedersen hash based on the same elliptic curve. But I don't have time to do that, and as I said it's the wrong burden of proof.

The Pointcheval-Stern proof for Schnorr requires ROM, and the security argument for EdDSA has always been that it's as secure as Schnorr based on that proof.

I already explained in the Twitter thread why I don't think the model in the Neven paper makes sense.