**sonmarcho** opened this issue 4 months ago
I made some experiments with the proofs of the hashmap and the AVL. Something I noticed is that by default `maxDischargeDepth` is 2, which is actually a lot and should not be necessary in most situations. I tried using a `maxDischargeDepth` of 1 and was able to do so in all the proofs of the AVL, and in 109 of the 129 calls to `simp_all` for the hashmap (without modifying the proofs). As a consequence, the verification time dropped from ~200s in both cases to ~135s for the hashmap and ~160s for the AVL, which is a big improvement.
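For reference, the discharge depth can be lowered on an individual call through the simp configuration (a minimal sketch; the example goal is illustrative):

```lean
-- Sketch: lowering the discharge depth on a single `simp_all` call.
-- `maxDischargeDepth` bounds how deeply simp recurses when trying to
-- discharge the hypotheses of conditional rewrite rules.
example (x y : Nat) (h : x = 0) (h' : x = 0 → y = 1) : y = 1 := by
  simp_all (config := { maxDischargeDepth := 1 })
```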
Generally speaking, I believe we need a general tactic to simplify the context, and `simp_all` may not be what we want. For instance, if I write `simp_all [alloc.vec.Vec.new]` (in the proof of the hashmap), what I really want is probably something more like `simp (config := { maxDischargeDepth := 1 }) [alloc.vec.Vec.new] at *; simp_all (config := { maxDischargeDepth := 1 }) only [LEMMAS_TO_SIMPLIFY_PROPOSITIONAL_LOGIC]`, and the latter is likely a lot more performant than the former.
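Such a context-simplification tactic could be prototyped as a macro. Everything below is a hypothetical sketch: the name `simp_ctx`, the hard-coded lemma, and the `only` list are all placeholders for whatever the real tactic would take as parameters.

```lean
-- Hypothetical sketch of a dedicated context-simplification tactic:
-- first rewrite everywhere with the user-supplied lemmas at a shallow
-- discharge depth, then let `simp_all` finish with a restricted,
-- propositional-logic-only simp set (the lemma lists are illustrative).
macro "simp_ctx" : tactic =>
  `(tactic|
     (simp (config := { maxDischargeDepth := 1 }) [alloc.vec.Vec.new] at *;
      simp_all (config := { maxDischargeDepth := 1 }) only [and_true, true_and, not_not]))
```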
I also made some measurements (before applying the modifications above). For instance, the graph below shows the accumulated time spent evaluating the tactics of the hashmap proofs, ranking the tactic calls from most expensive to least expensive. There are some caveats: the times were printed with `set_option profiler true`, which doesn't only report the time spent evaluating tactics. But one interesting thing to notice is that ~500 tactic calls (< 10% of all calls) are responsible for ~150s of proof time (> 66% of the time spent), so there are probably a few spots we can investigate to save a lot of verification time.
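For completeness, this is roughly how such timings are collected (the threshold value is illustrative); note that the profiler reports elaboration steps in general, not only tactic evaluation, which is the caveat mentioned above:

```lean
-- Enable per-step timing output during elaboration.
set_option profiler true
-- Only report steps taking longer than 50ms.
set_option profiler.threshold 50
```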
A tactic which is also worth investigating is `scalar_tac`. An obvious place to look is the case disjunctions it performs. For instance, whenever it sees an instance of `Scalar.toNat x`, it introduces the disjunction `(Scalar.toNat x : Int) = x.val ∨ (Scalar.toNat x : Int) = -x.val` (if we convert `x` to a natural number and lift this to an integer, the result is equal to `x.val` or `-x.val`), then does a case disjunction on it (or rather, lets `omega` do the case disjunction). This can be expensive, and there are places where it is not necessary. For instance, if `x` is an unsigned scalar (`Usize`, `U32`, etc.), then `x.toNat` is always non-negative and we can eliminate one of the cases.
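The unsigned case can be illustrated with Lean's own `Int.toNat`: once non-negativity is known, the disjunction collapses to a single equation and no case split is needed (the Aeneas `Scalar` analogue would be similar):

```lean
-- With a non-negativity hypothesis, casting `toNat` back to `Int` is
-- the identity, so the `-x.val` disjunct never arises.
example (x : Int) (h : 0 ≤ x) : (x.toNat : Int) = x :=
  Int.toNat_of_nonneg h
```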
Another big issue is theorems which use `termination_by` and `decreasing_by` clauses: after the proof is done, Lean spends an insane amount of time on the `decreasing_by` clause. This is a general Lean issue (not specific to Aeneas), and my intuition is that it has to re-encode the whole theorem (and its proof) to use the proof of termination, so in the end it must do the proof twice. But it is worth investigating how we can improve that, either by pushing changes to the Lean developers or by developing strategies to minimize the overhead.
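A toy example of the pattern in question (not from Aeneas; syntax as in recent Lean versions): the body is elaborated first, and the well-founded recursion is compiled afterwards using the `decreasing_by` proof, which is where the extra time goes.

```lean
-- Minimal definition with explicit termination annotations. Lean
-- elaborates the body, then separately processes the `decreasing_by`
-- obligation (here: `n - 1 < n` given `h : ¬ n = 0`).
def sumTo (n : Nat) : Nat :=
  if h : n = 0 then 0 else n + sumTo (n - 1)
termination_by n
decreasing_by omega
```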
The proofs in the Lean backend are a bit slow for my taste. For instance, until recently, the proofs of the hashmap and the AVL took ~200s each, which is quite a lot. A more reasonable target (according to my intuition) would be less than a minute.
One issue, for instance, is the simplifier. It can be slow at times, especially when using `simp_all` in contexts with big goals. I did some preliminary diagnostics which indicate that it can sometimes perform up to ~10k simplification lemma attempts, which is not reasonable. We should investigate this. In parallel, this seems to indicate that `simp_all` is maybe not the best way of "blasting" a proof: for instance, I often use it when what I actually need is some sort of congruence closure. Maybe we should investigate other proof styles as a consequence.
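As one data point for the congruence-closure idea, a goal that is pure congruence reasoning can be closed directly by Mathlib's `cc` tactic, without churning through a simp set (assuming Mathlib is available):

```lean
import Mathlib.Tactic.CC

-- No rewriting needed: the goal follows from closure under the
-- equalities in context and function congruence.
example (f : Nat → Nat) (a b : Nat) (h₁ : a = b) (h₂ : f b = 0) :
    f a = 0 := by
  cc
```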