Leakage traces - Githubissues

OwenConoly commented 2 months ago

This PR will add leakage traces to the source and intermediate semantics, augment the compiler-correctness statement and proof to talk about leakage traces, and add some examples involving leakage traces.

OwenConoly commented 2 months ago

For now, this branch only has some edits to Semantics.v. Before I add anything else, I wanted to ask whether I should be making these edits (i.e., adding leakage traces) in Semantics.v, MetricSemantics.v, or both? Also, why are Semantics.v and MetricSemantics.v separate, and are there plans to merge them eventually?

OwenConoly commented 2 months ago

I hope it won't be too hard to state and prove the appropriate relationships between these variants.

Proving the appropriate relationship between Semantics.v and LeakageSemantics.v will only be trivial if we modify Semantics.v so that exec takes the pick_sp parameter. If LeakageSemantics exec takes pick_sp as a parameter but Semantics exec stays as is, then proving any sort of equivalence between them is equally as nontrivial as that semantics-equivalence statement we put in the paper, the one that I wrote a ~3000-line proof of.

So, should I modify Semantics and MetricSemantics execs to take pick_sp as a parameter, and then prove the appropriate relationships between Semantics vs. MetricLeakageSemantics, MetricSemantics vs. MetricLeakageSemantics, and LeakageSemantics vs MetricLeakageSemantics?

OwenConoly commented 2 months ago

However, the Semantics-exec-implies-LeakageSemantics-exec direction is trivial even if LeakageSemantics takes pick_sp as a parameter and Semantics doesn't. So if that direction of the equivalence is all we care about, then potentially I could just leave Semantics and MetricSemantics alone.

andres-erbsen commented 2 months ago

Ah, indeed. I was only thinking of the easy direction, Semantics implies LeakageSemantics. I don't think we should add pick_sp to current semantics. @samuelgruetter do you think it would be acceptable to carry both copies in the repo, but only the half of the equivalence proof that allows program proofs without leakage to be used with a compiler proof that preserves everything they say about leakage?

However, the other direction of this relationship might be a good opportunity to drill down on the difficulty in the equivalence proof and perhaps simplify it. We don't have to do this, but I'm thinking that if you could state the desired equivalence with a minimized version of semantics that is just complicated enough to illustrate the challenge (without abstract traces, predictors, or even function calls and etc), then I would give it a try. "How hard can pushing a quantifier into an inductive be" :rocket::joy::roll_eyes:

samuelgruetter commented 2 months ago

Ah, indeed. I was only thinking of the easy direction, Semantics implies LeakageSemantics. I don't think we should add pick_sp to current semantics. @samuelgruetter do you think it would be acceptable to carry both copies in the repo, but only the half of the equivalence proof that allows program proofs without leakage to be used with a compiler proof that preserves everything they say about leakage?

Sounds good!

However, the other direction of this relationship might be a good opportunity to drill down on the difficulty in the equivalence proof and perhaps simplify it. We don't have to do this, but I'm thinking that if you could state the desired equivalence with a minimized version of semantics that is just complicated enough to illustrate the challenge (without abstract traces, predictors, or even function calls and etc), then I would give it a try "How hard can pushing a quantifier into an inductive be" 🚀😂🙄

This kind of proof would be interesting as a minimal standalone file that might accompany a paper as "supplemental material", where the focus is really to make it as simple as possible, but without simplifying away the core difficulty. (And if at some point, we actually need this direction of the equivalence for real, we can again scale it up to the whole language).

OwenConoly commented 2 months ago

"How hard can pushing a quantifier into an inductive be"

This could be nice, yeah! To make it as academic as possible (and remove the coqutil/bedrock2 dependency and get a standalone file) I'd probably make the following simplifications to the equivalence proof I currently have:

no memory, only locals
no leakage traces or metrics either
locals are nats instead of words (in addition to removing a dependency, this makes me happy because proofs seem less trivial when they're written about Turing-complete languages)
no ifs, since you can write them with whiles anyway
instead of stack allocation, the quantifier is just an RNG construct
no function calls

I'd probably keep IO calls though, since the nondeterminism there seems to add an interesting dimension.

OwenConoly commented 2 months ago

Oh, bedrock2 is Turing-complete! I forgot that you can store and retrieve as much information as you want in the IO trace (with an appropriate choice of ext_spec). That's kinda cool.

I had thought that it was just a finite-state machine, so in principle, you could prove some absolute upper bound (depending only on word size and source program) on the size of an exec proof tree (and then this sort of equivalence theorem becomes trivial).

andres-erbsen commented 2 months ago

:+1: You can store a turing-machine tape in a nat, so I think all options here are turing-complete. And I'd even remove I/O at first, we can add it back later if we want.

OwenConoly commented 2 months ago

The existence of (Metric)LeakageSemantics.v implies the existence of:

(Metric)LeakageLoops.v, (Metric)LeakageProgramLogic.v, (Metric)LeakageWeakestPreconditionProperties.v, and (Metric)LeakageWeakestPrecondition.v,

right?

andres-erbsen commented 2 months ago

Yes. At some point I may try to share more code, but for new these are just copied I think.

samuelgruetter commented 2 months ago

The existence of (Metric)LeakageSemantics.v implies the existence of:

(Metric)LeakageLoops.v, (Metric)LeakageProgramLogic.v, (Metric)LeakageWeakestPreconditionProperties.v, and (Metric)LeakageWeakestPrecondition.v,

right?

Will we ever need a program logic that does both metrics and leakage at the same time? If not, the needed files should probably just be

MetricLeakageSemantics.v (because in the compiler, we want both at the same time)
(non-metric) LeakageLoops.v, LeakageProgramLogic.v, LeakageWeakestPreconditionProperties.v, and LeakageWeakestPrecondition.v

andres-erbsen commented 2 months ago

I don't have specific code planned where this would be needed, but I imagine basic library functions like memset could end up being used by callers that need either kind of spec. I'm fine with procrastinating on building support for this, though.

OwenConoly commented 2 months ago

Ok. I can just do the MetricLeakage files while I'm at it, if they'll be needed eventually anyway. (Probably I'll end up making some errors in MetricLeakageProgramLogic, since there won't be any test cases for it.)

Unrelatedly, since I am adding the separate files for Leakage stuff instead of changing what was already there, my changes to the source language shouldn't break the compiler proof. So it seems natural to split this PR into two: in this PR I can just change the source language and add some examples, and then I can make a separate PR for compiler proof things. Does that sound good?

andres-erbsen commented 2 months ago

Separate PRs sounds good to me. (There is some possibility that we will want to review them together, though.)

OwenConoly commented 4 weeks ago

Unfortunately, my attempt at writing my compiler-proof leakage-transformation functions without CPS did not work out nicely, for two reasons:

It's convenient to be able to do some error-handling-ish logic, where hitting a certain thing in the recursion causes the function to exit immediately. This is kind of annoying to do without CPS: the solution I found was to carry around a flag saying 'have we hit an error yet'. This isn't a big problem though, and on its own probably wouldn't make using CPS worth it.
The bigger issue is that, because we have the deterministic stack allocation addresses, our inductive hypothesis for the compiler phase proofs needs to say that the high-level pick_sp function is of a particular form. With the function in CPS, there's a natural way to say what this form is, and it's easy to prove given the recursive structure of the continuation-passing function. With the function not in CPS, it is not so convenient to say what it should be. Basically, the flexibility lost is that in CPS, we can just state that the high-level pick_sp function should look like the leakage transformation function, but it doesn't matter what the continuation is.

Amusingly, this second issue is a non-issue if we work with nondeterministic stackalloc addresses. This made me assume it wouldn't be an issue with the deterministic stackalloc addresses, but apparently I was wrong.

I probably didn't explain any of that well enough to be understandable; I could elaborate if any of it is interesting. But anyway, since I have to use CPS now, I have a need to write a non-structurally-recursive function which takes a function as an argument. I think there are three options for how to prove the fixpoint equation for such a function:

Use a functional extensionality axiom.
Prove some analogue of Fix_eq that's parameterized over arbitrary equivalence relations instead of just equality.
Do what is being done in fiat/src/Common/Wf1.v. I don't know exactly what is going on there, but it looks like it solves this problem?

I already did option (2), so I'm inclined to just stick with that unless there are compelling reasons for doing something else. I imagine there are no good reasons for doing (1), but I'm not sure about (3).

JasonGross commented 4 weeks ago

3. Do what is being done in fiat/src/Common/Wf1.v. I don't know exactly what is going on there, but it looks like it solves this problem?

I already did option (2), so I'm inclined to just stick with that unless there are compelling reasons for doing something else. I imagine there are no good reasons for doing (1), but I'm not sure about (3).

I'm not sure. Wf1 is a complicated general wrapper around the simple trick that if you know how many arguments your recursive function takes, you can demand pointwise extensionality rather than equality, and the induction goes through. It sounds like what you did is somewhat more general than this (though note that you maybe want to parameterize over a PER instead of an equivalence, if you want to avoid funext?)

andres-erbsen commented 3 weeks ago

I don't have opinions on this, feel free to do whatever.

mit-plv / bedrock2

Leakage traces #431