PrincetonUniversity / VST

Verified Software Toolchain
https://vst.cs.princeton.edu
Other
424 stars 91 forks source link

localize / entailer_for_load_tac / unlocalize / Coq 8.17 #756

Closed andrew-appel closed 3 months ago

andrew-appel commented 3 months ago

The localize / unlocalize tactics are behaving worse in Coq 8.17 than they were in Coq 8.16. The symptom is that certain hypotheses get cleared from above the line.

Initially I thought this was a bug in refine, and I reported this as Coq issue 18049. After some very diligent analysis by @JasonGross, I traced it to this:

Require Import VST.floyd.proofauto.
Goal TT |-- !! True.
auto.

In Coq 8.16, auto does not solve the goal, but simple apply @TT_right solves the goal. In Coq 8.17, auto does solve the goal, and info_auto reports simple apply @TT_right.

In fact, this goal can be solved either by simple apply @TT_right or by apply derives_refl. The former, it turns out, entangles all open existential variables with the current proof goal, and the latter does not. This means that in some other proof goal to be solved later, the refine tactic will clear a bunch of assumptions from above the line (if TT_right were used), or will not do so (if derives_refl were used here).

Well, it turns out that the unlocalize tactic uses refine; and refine is not really the culprit, the proof goal that's being solved by the refine is just not solvable if the existentials are entangled. And it turns out that entailer_for_load_tac uses quick_typecheck3 which, at some point, uses auto. In Coq 8.16 (and before), this auto did not apply TT_right, and I suspect the goal was solved some other way (perhaps with derives_refl). But in Coq 8.17 the auto applies TT_right.

An example of unlocalize where this matters is at line 162 of this file.

One solution that works in that case is to adjust quick_typecheck3 so that it does try apply derives_refl just before it does auto. That's not a very general solution, and it does illustrate that localize / unlocalize is not very robust. But a partial solution might be better than nothing.

andrew-appel commented 3 months ago

It is still the case that localize/unlocalize is not very robust, in the following sense. The way these tactics are used is,

localize [c].  (* where c is an mpred *)
forward.   (* forward symbolic execution for one or more statements, perhaps with other proofs mixed in *)
unlocalize [d].  (* where d is an mpred *)
(* now prove an entailment involving c and d *)

The problem is that if there are hypotheses above the line before the localize, that are needed in the entailment proof below the unlocalize; and if the forward symbolic execution proof does something with evars in a way that entangles things, then those hypotheses cannot be used where they are needed.

P.R. #757 patches things back to the status quo up to Coq 8.16. That is, a certain forward symbolic execution using load_tac had previously not entangled the evars; a change to the semantics of auto in Coq 8.17 now invoked a different lemma that caused entanglement; the patch prevents that in limited cases, so that several existing proofs in the CertiGraph project continue to work.

But it's not a very general solution. localize\unlocalize is still fragile, because the forward-symbolic-execution proof between them might entangle the evars.

JasonGross commented 3 months ago

The former, it turns out, entangles all open existential variables with the current proof goal,

What does this mean?

Maybe you want to clear all hypotheses that contain evars before running auto?

andrew-appel commented 3 months ago

I don't know what it means. I find all of this very mysterious. But I don't think your proposal would work. The solution I have adopted is to ensure, whenever possible, that the foward symbolic execution between localize and unlocalize does not instantiate any evars. In the limited situations where localize/unlocalize is used, this is typically just one or two forward statements that load from a record field.

JasonGross commented 3 months ago

I mean, auto should always either do nothing or solve the current goal. So it is safe to replace auto with try solve [ auto ]. Now we can write a tactic clear_evars := repeat match goal with H : ?T |- _ => has_evar T; tryif clear dependent H then idtac else fail 1 (* if we can't clear all evar-containing hypotheses then we better not keep going, because we might mess up dependencies *) end. Then it should be fine to replace try solve [ auto ] with try solve [ progress clear_evars; auto | auto ]; either we solve the goal after clearing evar-containing hypotheses, or we (probably) fail to solve the goal. This may be a bit slower, maybe we want to just do try solve [ clear_evars; auto ], and see if that works. Alternatively, we can do something like generalize over all evars, I can write that code for you if you want to see it, and it would probably work even better than clear_evars.

This is all running on the assumption that auto is the problematic tactic; is this assumption correct?

andrew-appel commented 3 months ago

The problem is that auto is now (in Coq 8.17) succeeding with a lemma from the hint database, because auto is a bit more powerful. Previously, in Coq 8.16, auto did not match as fully as simple apply, but now it does, IIRC. And the lemma that gets thereby applied is the one that causes the problem. I could solve the same goal with a different lemma (eq_refl) and there is no interference with evars.

JasonGross commented 3 months ago

What is the goal and context such that applying the lemma messes with evars? (Does the goal have evars? Is it using an assumption that has evars? Something else?)

And regardless, what if you used something like

Ltac set_evar ev := let e := fresh "e" in set (e := ev) in *.
(* Alternative if above is too slow *)
(*
Ltac set_evar ev :=
  let e := fresh "e" in
  pose ev as e;
  change ev with e;
  repeat match goal with
    | [ H : context[ev] |- _ ] => progress change ev with e in H
    | [ H := context[ev] |- _ ] => progress change ev with e in (value of H)
    end.
 *)
Ltac set_evars_in T :=
  repeat match T with
    | context[?e] => is_evar e; set_evar e
    end.
Ltac set_evars :=
  match goal with |- ?G => has_evar G; set_evars_in G end;
  repeat match goal with
    | [ H : ?T |- _ ] => has_evar T; set_evars_in T
    | [ H := ?T |- _ ]
      => tryif is_evar T
        then fail
        else (has_evar T; set_evars_in T)
    end.
Ltac generalize_evars :=
  set_evars;
  repeat match goal with
    | [ H := ?e |- _ ] => is_evar e; clearbody H
    end.

and then replaced auto with try solve [ generalize_evars; auto ]. This should preserve behavior on all goals that can be solved without unifying evars, and should refuse to unify evars otherwise.

andrew-appel commented 3 months ago

If you are super interested in this, I can spend an hour and produce instructions for reproducing this so you can see exactly where simple apply TT_right seems to entangle some evars in a way that apply derives_refl does not, with the proof goal, TT |-- !!True. Or maybe you can look at the type of TT_right in VST to guess more about what may be happening. But really, I'm willing to drop this, because localize/unlocalize is not super-important right now.

JasonGross commented 3 months ago

I think I don't have time to dig into it much more than I have, so we should just drop it if the Ltac I gave above for try solve [ generalize_evars; auto ] does not work.

For future reference, I ended up porting this to Ltac2, though, and will leave the Ltac2 code here

From Ltac2 Require Import Ltac2.
Ltac2 set_evars_in (t : constr) :=
  repeat (match! t with
          | context[?e] => if Constr.is_evar e then set $e in * else Control.backtrack_tactic_failure "not evar"
          end).
Ltac2 set_evars () :=
  set_evars_in (Control.goal ());
  List.iter
    (fun (_, body, typ)
     => match body with
        | Some body => if Constr.is_evar body then () else set_evars_in body
        | None => ()
        end;
        set_evars_in typ)
    (List.rev (Control.hyps ())). (* reverse hyps so that we deal with leaf evars before dealing with their dependencies *)
Ltac2 clearbody_evars () :=
  Std.clearbody
    (List.map
       (fun (h, _, _) => h)
       (List.filter
          (fun (h, body, _)
           => Option.map_default Constr.is_evar false body)
          (Control.hyps ()))).
Ltac2 generalize_evars () := set_evars (); clearbody_evars ().

(but it's a bit less robust than the Ltac1, I think)

andrew-appel commented 3 months ago

Thank you for the suggestion. The problem will be, I think, that it's not just a single call to auto that's the problem. The entire forward symbolic execution between the localize and the unlocalize would have to be protected this way; it could include multiple user calls to forward, entailer!, and other tactics. It's only a lucky accident that, in many typical cases, this symbolic execution has comprised only a single forward statement that has not been too complicated.

JasonGross commented 3 months ago

It's possible that doing set_evars at the beginning of localize and calling

Ltac subst_evars :=
  repeat match goal with H := ?e |- _ => is_evar e; subst H end.

at unlocalize would perform decently. Many tactics are less willing to mangle evars that are hidden behind a context variable. And unlike the generalize_evars solution, this one does not lose any information and so can be used even when we're not about to finish a goal.

andrew-appel commented 3 months ago

Remark 1

I adjusted the suggestion of @JasonGross to fix a bug and improve its features, as follows (but then see remark 2...)

Definition protect_evar {A} (x: A) := x.

Ltac set_evar_in T :=
  match T with context [?ev] => 
    is_evar ev;
    let e := fresh "e" in
    pose (e:= protect_evar ev);
    change ev with e;
    repeat match goal with
      | [ H : context[ev] |- _ ] => progress change ev with e in H
      | [ H := context[ev] |- _ ] => progress change ev with e in (value of H)
    end
  end.

Ltac set_evars :=
  repeat match goal with |- ?G => set_evar_in G end;
  repeat match goal with
    | [ H : ?T |- _ ] => set_evar_in T
    | [ H := ?T |- _ ]
      => tryif is_evar T then fail else
         lazymatch T with
         | protect_evar _ => fail
         | _ => set_evar_in T
         end
    end.

Ltac subst_evars :=
  repeat match goal with H := protect_evar ?e |- _ => red in H; subst H end.

Ltac localize R_L :=
  set_evars; old_localize R_L.

Tactic Notation "unlocalize" constr(R_G2) :=
  subst_evars; unlocalize_plain R_G2.

Remark 2.

It probably isn't necessary to search for all evars and protect them. There's one evar in particular that we care about, then one introduced by the eexists in the localize tactic. It seems to be fine just to protect that one. Easy, and fast. That's what I've done in P.R. #758. Instead of eexists, create a protected evar w := protect_evar (?ww), and then do exists w.