mit-plv / fiat-crypto

Cryptographic Primitive Code Generation by Fiat
http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP19.pdf
Other
720 stars 146 forks source link

Using Fiat-generated Bedrock2 functions. #921

Open RasmusHoldsbjergCSAU opened 3 years ago

RasmusHoldsbjergCSAU commented 3 years ago

Lately, I have been trying to use Bedrock2 functions for implementing elliptic curves.

We have worked out a simple example illustrating the functionality of the Bedrock2 framework that is needed for this, and the approach that we intend to follow.

We first synthesize Bedrock2 functions for a 381 prime, m. This is done here: https://github.com/RasmusHoldsbjergCSAU/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Examples/bls12prime.v

The example consists of a simple Bedrock2 function that does the following:

The function itself can be found here: https://github.com/RasmusHoldsbjergCSAU/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Examples/FunctionCallExample.v#L168

We prove that the function meets its specification here: https://github.com/RasmusHoldsbjergCSAU/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Examples/FunctionCallExample.v#L624

Note that a number of tactics are used in this proof. These tactics mainly serve as glue between different means of addressing memory; namely "anybytes" "Scalar" and the Fiat-specific "Bignum".

Stackallocation uses "anybytes" Store uses "Scalar" and the specifications of Fiat-generated functions are phrased in terms of "Bignum".

I was hoping that someone (@andres-erbsen, @JasonGross ?) could comment on the viability of using this approach. Does it comply with the intended use of Bedrock2/Fiat-crypto or is there an easier/better way of doing this?

A slightly simpler example can be found here: https://github.com/RasmusHoldsbjergCSAU/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Examples/quadratic_extension_bls12.v This is an implementation of addition and multiplication in the quadratic field extension of F_m. I say that it is simpler because we do not need to store any constants, as all the data needed is passed to the function as arguments. Note that code is shared between the files "FunctionCallExample.v" and "quadratic_extension.bls12.v"

A number of technical lemmas are used as well. Ideally, these would be refactored into apriopriate repos, as is done here on my own fork of Fiat-crypto: https://github.com/RasmusHoldsbjergCSAU/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Generic/Bignum.v

If there is interest, I will gladly submit a pull request once these things have been sorted out.

andres-erbsen commented 3 years ago

The bedrock2 code and spec look reasonable, I would most likely have done something very similar. I might have represented Fp2 elements using two back-to-back arrays (pointed to by a single pointer), but I doubt it matters much.

I was really hoping users of bedrock2 wouldn't need to directly manipulate msplit and use lemmas like split_distr. That you needed them is a sign that our lemmas at the separation logic level are somehow lacking. Could you describe what high-level proof task led you to the goals that you used direct map manipulation for? If it's stack allocation, maybe https://github.com/mit-plv/bedrock2/blob/f99cc7e01af7c6720791065ec92715309bc4ba36/bedrock2/src/bedrock2/ProgramLogic.v#L133-L154 will help you out...

I don't understand the technical lemmas, though -- array used in the definition of Bignum is already a Fixpoint very similar to what you define as many_Scalars, although if you wanted to factor out the length predicate that would be fine. Or is there something else going on? In general, such changes that are useful to you would be welcome as a PR against the same file.

(*Why let-bindings? Are they necessary?*)

They are used to generate names for variables in your proof context and to parse integers in a different grammar entry than the code. I think now that ltac2 is merged we could implement something that would generate sane context variable without needing the lets, but the latter still stands https://github.com/coq/coq/issues/9516

RasmusHoldsbjergCSAU commented 3 years ago

Thanks for the response!

The split_distr lemma was indeed written to help with stack allocation; I had some problems getting stragihtline_stackalloc to work, but now, having experimented a bit more with it, it seems to work fine; and it would be nice to avoid having to deal with maps at the level of abstraction of split_distr altogether.

and yes, manyScalars is very similar to Bignum, but in some cases I found it more convenient to work with a fixpoint that was recursive on the length of the list rather than the list itself.

With respect to the function specs, someone pointed out a problem that I have been trying to solve: At the current state, the preconditions of functions like Fp2_add require that inputs and outputs are separated in memory. This is too strict, as ideally I would want to allow overlapping inputs and outputs. I noticed that the same problem is commented on here: https://github.com/mit-plv/fiat-crypto/blob/master/src/Bedrock/Field/Synthesis/Examples/MulTwice.v#L62

Is this something that is being worked on? Or do you have any suggestions on how to approach this?

The Bedrock2 functions generated by Fiat-Crypto does not seem to have this problem; here the specs make use of several separation logic preconditions, to allow for inputs that are equal, disjoint or partially overlapping.

I have tried to write a wrapper function for Fp2_add that copies input arrays into freshly allocated memory, to ensure that the precondition of Fp2_add is met, but the proofs and tactics I have used so far do not seem to generalize well to the case with multiple separation logic preconditions.

andres-erbsen commented 3 years ago

I think you're right that we haven't actually used bedrock2 program logic to verify code works on aliased memory. The trick of maintaining multiple separation-logic decompositions of the memory should work, but I'm not surprised that the automation does not handle it yet. I had totally forgotten about this, but yes, I would like to make it work in bedrock2 automation. How soon would this need to happen to be of use to you?

andres-erbsen commented 3 years ago

Also, I recently merged a less ugly notation for function specifications :) https://github.com/mit-plv/bedrock2/commit/3ece613c19ff128290a5faa05c23f7fe8909d686#diff-6d97154a8b580676c384764c939b75c8ca676d82811982476281fde94880ccfbR28

andres-erbsen commented 3 years ago

I added a very simple example with memory aliasing at https://github.com/mit-plv/bedrock2/blob/master/bedrock2/src/bedrock2Examples/indirect_add.v It seems that the automation worked well enough in this case.

On a second thought, I think mul_twice does not work the same when the inputs overlap.

If out and y are disjoint, you get let out := x * y in let out := out * y in out which evaluates to (x*y)*y
If out is an alias of y, you get let y := x * y in let y := y * y in y which evaluates to (x*y)*(x*y) In case that's the behavior you want to verify, I think we can't do much better than writing two proofs about the same function.

If you have a more interesting example where the same specification holds regardless of aliasing, let me know :).

RasmusHoldsbjergCSAU commented 3 years ago

New notation looks nice : )

The functions generated by Fiat-Crypto seem to work regardless of aliasing; although they are a lot more complicated.

I have tried doing something similar here: https://github.com/RasmusHoldsbjergCSAU/bedrock2/blob/master/bedrock2/src/bedrock2Examples/indirect_add.v#L72

By storing the output of the firs addition in an intermediate variable, the specification should hold in any case, but I am not sure how to prove it. Note that the specification has also been altered slightly.

andres-erbsen commented 3 years ago

I don't have a solution for this one.

I did replicate your attempts at https://github.com/mit-plv/bedrock2/commit/c167c2f66f3d519417622cf110973d7836c52229 and put in my best effort to localize the challenge. Here is what I think is the core of this challenge: if you have multiple separation-logic hypotheses in your context before symbolically executing the function call, each separation-logic precondition of that function gets instantiated with one of your hypotheses (not necessarily using all of them). So far, so good. However, indirect_add has only one separation-logic decomposition in the postcondition, and it is based on only one of the separation-logic preconditions -- so after the function call, you have only one separation-logic decomposition of the memory. Sometimes, this is all you could prove anyway. Yet in this case, the choice of which separation-logic decomposition to use to fill in the precondition of indirect_add is underconstrained, but this choice forces values of evars that appear again in the postcondition. So it appears that when you write to the temporary stack location you get to preserve any one of the pointers that are disjoint from it but may mutually alias, whereas real execution actually preserves all of them.

One avenue for attempting to overcome this would be to put a non-separating conjunction (/\, not *) in the separation-logic precondition of the caller, so that when it gets to indirect_add the symbolic state would be stack * (a /\ b /\ c).

andres-erbsen commented 3 years ago

Here is a sketch of what using non-separating conjunction would look like. The admits are about pushing * under /\, I think that should be a generally-provable lemma. Does this approach look like it would be satisfying to you? https://github.com/mit-plv/bedrock2/blob/9ef83dffbfe87360865c1b8a56de579a004943f1/bedrock2/src/bedrock2Examples/indirect_add.v#L145-L210

RasmusHoldsbjergCSAU commented 3 years ago

Yes, this looks great!

cpitclaudel commented 3 years ago

FTR, I made a Rupicola version of @andres-erbsen's example at https://github.com/mit-plv/rupicola/blob/master/src/Rupicola/Examples/Cells/IndirectAdd.v