Open themighty1 opened 3 years ago
Yes. And maybe I should explain this optimization first.
So, in our work, we try to avoid running the GCM function inside garbled circuits. Instead, what we do is the following: recall that applying the GCM to a ciphertext is an inner product. If the N parties in the secure computation secret-share the GCM sequence, then each of them can compute a share of the inner product outside the secure computation, if they know the ciphertext. They can then combine this result in secure computation.
The circuits here are used to generate the GCM sequences, which can later be secret-shared among the N parties, to enable the optimization above.
Why do we only focus on 10? Because computing this sequence itself is expensive, and we are not actually sending a lot of data in our application. So, as a result, we choose somewhere between 5-7, and when we are sending a long message (in our application, an email body in SMTP), we chop the message into smaller TLS packets, so 5-7 of the GCM sequence suffices.
We explain how to choose this number in page 7 https://eprint.iacr.org/2021/342.pdf on "Efficient GCM tag computation".
Now comes the question: what if I want to send a packet with more than 10 blocks worth of messages? In our work, we avoid so by just chopping the TLS packet into smaller ones (remember that this is always allowed, as the underlying TCP is a flow, so you can break the packet anywhere you want). However, if one application sincerely wants to send a longer packet in one shot (note: in practice, this is not needed), then a larger GCM circuit is needed.
It needs to be "synthesized", but does not need to go through the VHDL file. Instead, one can simply use this function: https://github.com/n-for-1-auth/circuits/blob/main/generator/generate_handshake_circuits.cpp#L1026
which computes H, H^2, ... as in the desired GCM sequence for the multiparty computation of the authentication tag.
This is a great answer, thank you very much!!! I thought about testing sending smaller TLS records. For some reason I thought that servers might reject records <512 bytes. But you've confirmed that it worked for you. Thanks for your research.
Note that this optimization is sort of restricted to our work: as our work is only sending the emails, we do not decrypt any result from the server. In other words, regardless of what the server says, we just keep sending the necessary email components.
This seems to be the complete opposite of TLS notary, in which the server response might (or might not) need to be decrypted.
If that is the case, then the protocol has to tolerate and decrypt whatever the server sends. In this case, if the server prefers sending a giant packet (so it needs a long GCM sequence), then our optimization does not work.
Alternatives exist. Maybe the TLS notary only notarizes the ciphertext and some key related materials, and the client is required to use zero-knowledge proofs to open up the data.
In case you are looking for efficient zero-knowledge proofs, I would suggest https://github.com/emp-toolkit/emp-zk This would be a longer discussion---if you are generating a proof just for a few verifiers, you don't use SNARK, as they are too slow to prove. Instead, one may leverage an interactive proof system where you prove to one another person at a time (each of them would be the page signer, or the notary). They tend to be orders of magnitudes faster.
Sometimes, when we are proving local computation on some data that one party knows (or is allowed to know), zero-knowledge suffices.
Thank you for the interesting discussion.
You are right, this optimization will not work for the server response which may be up to 16KB -> 1000 blocks -> 15M AND gate circuit for 1000 gcm powers.
For such cases with huge responses, I think an alternative approach to calculating gcm powers would be to use Paillier cryptosystem, so that prover/verifier end up with additive shares. Have you considered computing gcm shares in Paillier, Would it be compatible when multiple parties are involved?
Though Paillier gives you additive shares, it seems that computation is not easy---to do multiplication in Paillier, we need a lot of interactions. (But it might still be efficient enough.)
One solution is to do the SPDZ multiparty computation protocol in the field GF(2^{128}), which enables fast computation of GCM sequence.
@weikengchen , interesting suggestions, thanks. Could you take a look at this Paillier protocol to compute gcm shares?
Let's say Alice and Bob start out with additive shares of H (where H is gcm's encrypted zero). They want to compute powers of H (mod ghash polynomial) using Paillier.
E(Ha+Hb) == E(H)
E(H*M)
H*M
by adding another mask M2. Bob gets E(H*M+M2)
, sends it to Alice. Note that M2 must be a multiple of ghash polynomial (P). H*M+M2
. She reduces it mod P. The result is H*M mod P
.
Alice computes powers of H*M mod P
and sends those powers encrypted to Bob: E(H^2*M^2)
, E(H^3*M^3)
etc.E(H^2)
, E(H^3)
. Bob then subtracts his final mask S and sends E(H^2-S)
to Alice.S
and alice has H^2 - S
, which are additive shares of H^2.Do you think this can also work in MPC setting?
(I have only skimmed the protocol above. Sorry that what follows may not be correct.)
Paillier-based solution has one thing regarding the trust assumption, as it often requires both parties to be semi-honest. Today, it is better to avoid using homomorphic encryption to build a protocol when you need malicious security.
A malicious attacker may not follow the protocol, and there could be many ways to break it. Or in other words, it is very hard to prove that it is maliciously secure, which may not be desirable for some applications.
And back to the protocol. There it seem to have many small points beyond the difficulty to achieve malicious security.
It requires careful workout of the choices of M and M2 and so on: for example, some care is needed in choosing M, as a zero mask does not have an inverse in GF(2^{128}) (here, there is an order requirement for M), and adding M2 requires increasing the dimensions of the plaintext, and you need to add a lot of M2 for the security to work through. A lot of care is needed there as well.
The last step, which secret-shares H^2 between the two parties, is actually slightly complicated. The secret-shared field, which would be over the Paillier modulus n=pq, is not the field where we compute the GCM authentication tag (which is GF(2^{128}), so this secret-sharing may not be as useful.
And back to the standard approach using MPC.
I recommend standard approaches, as achieving malicious security is hard, but you get them directly by using the standard approaches. If you simply run a SPDZ MPC over GF(2^{128}) for the exactly same computation, you can be sure that it is secure. But it does have some implementation overhead, as the offline precomputation of SPDZ is hard to write.
In MPC, it would be easy. Let A provide the share as input to MPC. Let B provide the share as input to MPC. In MPC, perform repeated multiplication. Then in MPC, sample random elements to secret-share the results between the two parties.
Hi, thanks for making these circuits available. There are 10 powers of gcm circuits https://github.com/n-for-1-auth/circuits/tree/main/gcm Is my understanding correct that those can be used only to compute ghash for up to 10 AES blocks? For ciphertext larger than 10 blocks one has to synthesize a bigger circuit?