FStarLang / FStar

A Proof-oriented Programming Language
https://fstar-lang.org
Apache License 2.0
2.68k stars 232 forks source link

Z3 triggers a lemma even in the absence of required pattern. #1156

Open aseemr opened 7 years ago

aseemr commented 7 years ago

Given that client code is mostly written using sel and modifies, it would be better to change the spec of ST.write to be:

val op_Colon_Equals (#a:Type) (#rel:preorder a) (r:mref a rel) (v:a)
  : ST unit
    (fun h -> rel (sel h r) v)
    (fun h0 x h1 -> rel (sel h0 r) v /\ h0 `contains` r
               /\ modifies (singleton r) h0 h1
               /\ sel h1 r == v)
aseemr commented 7 years ago

Also bring back the nice { } syntax for modifies clauses.

nikswamy commented 7 years ago

Some context:

A bit of concrete API design that should help with SMT automation and may explain some long-standing weirdness with proofs of modifies clauses. Heads up @aseem, @kyod, @jroesch, @taramana et al., as this impacts FStar.ST, FStar.HyperStack.ST etc.

27 replies nik [21 hours ago] The program below fails to verify:

let test (x0:ref int)
         (x1:ref int)
         (x2:ref int)         
         (h0 h1 h2 h3:heap) =
   assume (h0 `contains` x0);
   assume (h0 `contains` x1);
   assume (h0 `contains` x2);   
   assume (modifies_t (singleton (addr_of x0)) h0 h1);
   assume (h2 == upd h1 x1 0);
   //assert (modifies_t (singleton (addr_of x1)) h1 h2);
   assume (modifies_t (singleton (addr_of x2)) h2 h3);
   assert (modifies_t (union (singleton (addr_of x0))
                      (union (singleton (addr_of x1))
                             (singleton (addr_of x2))))
                      h0 h3)

But, if you uncomment the assertion it succeeds.

What's going on is that FStar.Monotonic.Heap.modifies_t has quantifiers that are triggered on occurrences of sel h r. Without the assertion, the VC of the program above has no terms of the form sel h1 r, so the proof of transitivtity cannot be completed.

This suggests that the type of FStar.ST.op_Colon_Equals should be

val op_Colon_Equals (#a:Type) (#rel:preorder a) (r:mref a rel) (v:a)
  : ST unit
    (fun h -> rel (sel h r) v)
    (fun h0 x h1 -> rel (sel h0 r) v /\ h0 `contains` r /\ h1 == upd h0 r v 
               (* NEW *) /\ modifies (singleton r) h0 h1)

Or perhaps even this, which is more in the style of most stateful specs we write elsewhere, where the state is described using sel and modifies rather than upd.

val op_Colon_Equals (#a:Type) (#rel:preorder a) (r:mref a rel) (v:a)
  : ST unit
    (fun h -> rel (sel h r) v)
    (fun h0 x h1 -> rel (sel h0 r) v /\ h0 `contains` r
               (* NEW *) /\ modifies (singleton r) h0 h1
               (* NEW *) /\ sel h1 r == v)

It's worth keeping this in mind for the specs of corresponding functions (even things like new_region etc.) in our other libraries providing stateful primitives.

aseem [15 hours ago] the proof goes through for me

aseem [15 hours ago] even with the assertion commented out

aseem [15 hours ago]

aseemr@aseemr-dt1 ~/FStar
$ fstar.exe --version
F* 0.9.4.3
platform=Windows_x64
compiler=OCaml 4.02.3
date=2017-07-25T09:50:56+05:30
commit=cdaff635c

aseem [15 hours ago] the lemma lemma_contains_upd_modifies provides the modifies clause for upd, i think

aseem [15 hours ago] works with latest master too: 498e1e5b8

aseem [15 hours ago] but your proposal reminds me of a discussion we had long time back about removing upd altogether, so perhaps this a step towards it (removing upd from the client facing specs)

catalin [11 hours ago] the flaky support for upd is quite a pain when teaching people how to verify simple imperative programs, so wouldn't be that sad to see it replaced with something more reliable ... at the same time the sel-upd theory is something very fundamental to verification, and I don't yet know with what it would be replaced under this proposal and how easy it would be to teach that

catalin [11 hours ago] I guess it's some sort of sel-modify theory ... where can I find it though?

catalin [11 hours ago] (another thing I stumbled on in the Nancy school was the super verbose syntax for modifies clauses ... the nice set syntax is gone for good?) (edited)

aseem [11 hours ago] do you have examples of flakiness of upd?

aseem [11 hours ago] yes, that's essentially the idea ... sel-modify theory, and it gels well with how stateful specs are written in mitls or hacl*, upd is not very useful for functions that modify multiple references

aseem [11 hours ago] so, nik's idea should definitely make things more streamlined w.r.t. existing specs

aseem [11 hours ago] re. the syntax, agreed, will add it some time (it went away because of the shift from TSet to Set for modifies clauses, but i can change it)

catalin [11 hours ago] I saw upd flakiness for an exercise in which I asked students to prove an imperative factorial correct wrt a functional one; it's a combination of upd and nonlinear arithmetic that killed it I guess; the solution involving upd was succeeding with --verify_all and failing without, so super flaky here is the solution that worked better https://github.com/catalin-hritcu/fstar-course-nancy2017/blob/master/sol/FactorialST.fst (edited)

catalin [11 hours ago] anyway, this exercise is not so nice, and after the school I made a replacement that uses fibonacci https://github.com/catalin-hritcu/fstar-course-nancy2017/blob/master/code/03/FibonacciST.fst

aseem [11 hours ago] thanks, and you were writing the specs in these functions using upd, which was flaky?

aseem [11 hours ago] these examples in the current form go through (using the current ST.write spec that uses upd) (edited)

catalin [11 hours ago] right, things weren't working if we were specifying factorial ourselves in terms of upd, i.e. h1 == upd h0 ... (edited)

catalin [11 hours ago] which is very intuitive given the types of ST.write and ST.read, which I do show to the students

catalin [11 hours ago] if ST.write and ST.read would also be in terms of modifies it would be a different business, I wouldn't need to introduce upd at all

aseem [11 hours ago] right, makes sense

aseem [11 hours ago] (fwiw, the upd spec fails because we don't have extensional equality of heaps)

catalin [11 hours ago] right, I remember about that, and it's what I told students could be the source of flakyness ... still it doesn't explain why we use upd for write and read given that it's an anti-pattern in user code (edited)

aseem [11 hours ago] yep

aseem [10 hours ago] https://github.com/FStarLang/FStar/issues/1156

nik [1 minute ago] Weird: it verifies with Z3-4.5.1 but not with Z3-4.5.0

nik [6 hours ago] I actually don't understand how it could verify, since the lemma that Aseem mentions lemma_contains_upd_modifies has a trigger that includes a contains, which should not fire in this case (edited)

nik [6 hours ago] Can anyone explain why the proof succeeds with Z3-4.5.1? @wintersteiger ?

nik [4 minutes ago] The plot thickens: Indeed, the proof in z3-4.5.1 is brittle. If you generate a hint, it will fail to replay (edited)

nik [2 minutes ago] Even if you add the assertion, the unsat core reported by Z3-4.5.1 still fails to replay

nik [1 minute ago] OTOH, with z3-4.5.0, the proof fails without the assertion (as I originally reported). With the assertion, the proof succeeds and the hint replays, suggesting that the proof is robust and Z3 is finding the proof that I expect it to find.

nik [< 1 minute ago] An additional wrinkle is that the unsat core found by Z3 4.5.0 is also replayable by Z3-4.5.1

nik [< 1 minute ago] So, now I'm really curious to understand what proof Z3-4.5.1 is finding

aseemr commented 7 years ago

Fixed with 825188d. But keeping it open for further investigation by Christoph about Z3.

wintersteiger commented 7 years ago

I'm looking at @nikswamy's program in the third comment on this issue. Is that the one that fails with z3 4.5.0 but not with z3 4.5.1? For me, neither of those work because F says: ./gh1156.fst(3,13-3,16): (Error) Identifier not found: [ref]. Do I have to downgrade F to some previous state to reproduce the problem?

wintersteiger commented 7 years ago

Ah!

open FStar.Heap
open FStar.ST
open FStar.Set

got me closer, but now I get:

./gh1156.fst(8,0-22,28): (Error) Unknown assertion failed
./gh1156.fst(15,22-15,46): (Error) Expected type "FStar.Monotonic.Heap.tset Prims.nat"; but "FStar.Set.singleton (FStar.Monotonic.Heap.addr_of x0)" has type "FStar.Set.set ((*?u485*) _ x0 x1 x2 h0 h1 h2 h3)"
./gh1156.fst(18,22-18,46): (Error) Expected type "FStar.Monotonic.Heap.tset Prims.nat"; but "FStar.Set.singleton (FStar.Monotonic.Heap.addr_of x2)" has type "FStar.Set.set ((*?u512*) _ x0 x1 x2 h0 h1 h2 h3)"
./gh1156.fst(19,22-21,55): (Error) Expected type "FStar.Monotonic.Heap.tset Prims.nat"; but "FStar.Set.union (FStar.Set.singleton (FStar.Monotonic.Heap.addr_of x0))
  (FStar.Set.union (FStar.Set.singleton (FStar.Monotonic.Heap.addr_of x1))
      (FStar.Set.singleton (FStar.Monotonic.Heap.addr_of x2)))" has type "FStar.Set.set ((*?u574*) _ x0 x1 x2 h0 h1 h2 h3)"
Verified module: GH1156 (1561 milliseconds)
4 errors were reported (see above)

Are those errors expected?

wintersteiger commented 7 years ago

TSet instead of Set gets rid of the errors, but it verifies just fine using both versions of Z3, and FStar.ST.fst doesn't have a "val op_Colon_Equals", only "abstract let op_Colon_Equals" (and write). @nikswamy Help! How do I get the query I want?

catalin-hritcu commented 7 years ago

@wintersteiger I guess you need to go back in time 5 days before this issue was opened, since in the meanwhile @aseemr changed things quite a bit.

nikswamy commented 7 years ago

I can reproduce it and will post a .smt2 file here

nikswamy commented 7 years ago

queries-Mods.smt2.txt

This .smt2 file is proven unsat by Z3 version 4.5.1 - 64 bit - build hashcode 1f29cebd4df6 but is unknown using Z3 version 4.5.0 - 64 bit

Note, the unsat core reported by 4.5.1 is not replayable.

Here is the F* file from which I produced the .smt2, by running fstar --log_queries --hint_info mods.fst

module Mods
open FStar.Heap
open FStar.TSet
#set-options "--lax"
let singleton (r:ref int) = singleton (addr_of r)
let upd (h:heap) (r:ref int) (x:int) = upd h r x
let contains (h:heap) (r:ref int) = h `contains` r
#reset-options
let test (x0:ref int)
         (x1:ref int)
         (x2:ref int)         
         (h0 h1 h2 h3:heap) =
   assume (h0 `contains` x0);
   assume (h0 `contains` x1);
   assume (h0 `contains` x2);   
   assume (modifies_t (singleton x0) h0 h1);
   assume (h2 == upd h1 x1 0);
   //assert (modifies_t (singleton ( x1)) h1 h2);
   assume (modifies_t (singleton ( x2)) h2 h3);
   assert (modifies_t (union (singleton ( x0))
                      (union (singleton ( x1))
                             (singleton ( x2))))
                      h0 h3)

with this patch to FStar.Monotonic.Heap.fsti

$ git diff FStar.Monotonic.Heap.fsti
diff --git a/ulib/FStar.Monotonic.Heap.fsti b/ulib/FStar.Monotonic.Heap.fsti
index 723b6b4..42b1a77 100644
--- a/ulib/FStar.Monotonic.Heap.fsti
+++ b/ulib/FStar.Monotonic.Heap.fsti
@@ -204,7 +204,7 @@ val lemma_upd_unused
 val lemma_contains_upd_modifies (#a:Type0) (#rel:preorder a) (h:heap) (r:mref a rel) (x:a{valid_upd h r x})
   :Lemma (requires (h `contains` r))
          (ensures  (modifies (S.singleton (addr_of r)) h (upd h r x)))
-         [SMTPat (upd h r x)]
+         [SMTPat (upd h r x); SMTPat (h `contains` r)]

 val lemma_unused_upd_modifies (#a:Type0) (#rel:preorder a) (h:heap) (r:mref a rel) (x:a{valid_upd h r x})
   :Lemma (requires (r `unused_in` h))
wintersteiger commented 7 years ago

inst.old.txt inst.new.txt

I'm attaching two log files of the quantifier instantiation on that problem. The input to the final solver (after tactics, simplifiers, etc) is identical in Z3 4.5.0 and Z3 4.5.1, but some terms are numbered slightly differently. The final solver of the two Z3's behave exactly the same for about half of the runtime, but then the search diverges and they instantiate different quantifiers with different terms. In the log file new instance of -quantifier-name- means that it matched a quantifier and added it to the queue. The actual instance is then added later (look for new instance:). The first meaningful difference is at old line 17682/new line 17695 where 4.5.0 adds:

(let ((a!1 (HasTypeFuel (SFuel ZFuel)
                        (ApplyTT (ApplyTT (ApplyTT FStar.Heap.trivial_preorder@tok
                                                   Prims.int)
                                          Prims.int)
                                 Prims.int)
                        (FStar.Preorder.relation (PreType (BoxInt 0)))))
      (a!2 (HasTypeFuel ZFuel
                        (ApplyTT (ApplyTT (ApplyTT FStar.Heap.trivial_preorder@tok
                                                   Prims.int)
                                          Prims.int)
                                 Prims.int)
                        (FStar.Preorder.relation (PreType (BoxInt 0))))))
  (= a!1 a!2))

while 4.5.1 adds this:

(l17696et ((a!1 (@x2!842 (PreType (BoxInt 0))
                    (ApplyTT (ApplyTT (ApplyTT FStar.Heap.trivial_preorder@tok
                                               Prims.int)
                                      Prims.int)
                             Prims.int))))
  (= (HasTypeFuel (SFuel ZFuel) a!1 (PreType (BoxInt 0)))
     (HasTypeFuel ZFuel a!1 (PreType (BoxInt 0)))))

From there on in, they behave differently, finding lots of different instances. It's too much to trawl through and check those instances are all sound, but so far I see no reason to doubt that. @nikswamy are you suspicious because of instances from a particular quantifier that it shouldn't find?

I think 4.5.0 may just have given up too early, which I'm taking a closer look at anyways, but on smaller queries for now.

nikswamy commented 7 years ago

Yes, I am suspicious because the proof requires triggering the lemma lemma_unused_upd_modifies for which the term h1 contains x1 is needed, but there's nothing that's actually providing that term.

What about the unreplayable unsat core aspect of this issue: Z3-4.5.1 find an unsat core, but then is unable to prove the unsat core unsatisfiable, again suggesting that some weird triggering behavior is going on.

wintersteiger commented 7 years ago

Oh yes, there's definitely some undesirable triggering behavior going on. Since the goals after preprocessing, etc, are the same, I suspect that the search went off in 4.5.0 and 4.5.1 got lucky with respect to the term ordering etc. Thanks for the pointer, that should make it easy to pin-point exactly what is creating this problem.

wintersteiger commented 7 years ago

Weird! It claims it can do it without ever instantiating the lemma_unused_upd_modifies or lemma_unused_upd_contains quantifiers. Z3-4.5.0 instantiates the latter once. However, even weirder, when I replay all the quantifier instances one by one, it isn't able to derive the ground contradiction that should result. You may be onto a soundness bug with this one!

nikswamy commented 4 years ago

We should confirm if this is still the case with z3-4.8.5 and beyond