Deducteam / lambdapi

Proof assistant based on the λΠ-calculus modulo rewriting

Bug in higher-order matching #225

Closed francoisthire closed 4 years ago

francoisthire commented 5 years ago

I am sorry, I don't have time to minimize the example. I am aware of a type-checking issue in Lambdapi which is probably related to higher-order matching. The error can be found in the Librairies repository. Two files are in play: the first one encodes Cedille's core theory (cedille.dk), and the second one encodes a basic example with it (nat.dk).

The first file type checks in Lambdapi, but the second does not (it does in Dedukti, and Dedukti is right on this one). I know that the error comes from either this rule or this rule, both of which are higher-order rules. One of these rules is not applied when it should be.

gabrielhdt commented 5 years ago

The issue might come from the matching of an abstraction against an applied pattern variable: the current lambdapi rejects this case, as line 176 of eval.ml suggests.

GuillaumeGen commented 5 years ago

This issue suggests that even in the old syntax, LambdaPi performs syntactic matching rather than matching modulo beta.

francoisthire commented 5 years ago

Does your PR fix this issue, @gabrielhdt?

gabrielhdt commented 5 years ago

Hmm, for now it fails to build a tree, precisely because of this case. It should be easy to fix, though.

gabrielhdt commented 5 years ago

The decision trees branch now behaves like the master branch, that is, it raises

[cedille.dk, 46:0-51] Term [mk_iota cnat cz inat iz I] does not have type [Term star nat].
francoisthire commented 5 years ago

Alright, but the file is well-typed, so there is still a bug.

GuillaumeGen commented 5 years ago

The error message is weird, since the line which raises the problem is line 46 of nat.dk, not of cedille.dk. Whether "the file is well-typed" depends on the definition of matching. I think refusing this file could have been a reasonable choice, but both implementations must agree.

francoisthire commented 5 years ago

@GuillaumeGen Can you elaborate on your comment? Why do you think it depends on the definition of "matching"?

rlepigre commented 5 years ago

I can try to have a look later, but you should definitely try to make the example more minimal (just inline nat.dk at the end of cedille.dk), and also convert it to Lambdapi syntax. The legacy syntax may not be translated accurately, and it would be easier to debug this way.

GuillaumeGen commented 5 years ago

As far as I understand, LambdaPi implements AFSMs (or something close to them). Hence there are two kinds of application: the usual one, x y, and the metavariable one, X[y]. The metavariable one comes with an arity constraint, and in the LHS of a rule you have the pattern restriction (a metavariable may only be applied to distinct bound variables). Matching is performed syntactically on usual applications and "modulo beta" on metavariable applications. I put "modulo beta" in quotes since the metavariables have a fixed arity, so it does not really create lambdas.
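
For instance (an illustrative example, not taken from the files of this issue), with

symbol T : TYPE
symbol a : T
symbol g : (T ⇒ T) ⇒ T
// &F is a metavariable of arity 1, applied to the distinct bound variable x.
rule g (λx, &F[x]) → &F[a]

both g (λx, x) and g (λx, a) rewrite to a: the abstraction is matched "modulo beta" by instantiating &F with its body, and &F[a] on the right-hand side substitutes a for x in that body.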

francoisthire commented 5 years ago

@rlepigre Here you go

symbol const Sort : TYPE

symbol const star : Sort

symbol const box : Sort

symbol Rule : Sort ⇒ Sort ⇒ Sort

rule Rule _ &s → &s

symbol const Univ : Sort ⇒ TYPE

symbol Term : ∀ (s : Sort), Univ s ⇒ TYPE

symbol const ustar : Univ box

rule Term _ ustar → Univ star

symbol prod :
  ∀ (s1 : Sort), ∀ (s2 : Sort), ∀ (A : Univ s1),
  (Term s1 A ⇒ Univ s2) ⇒ Univ (Rule s1 s2)

rule Term _ (prod &s1 &s2 &A &B) → ∀ (x : Term &s1 &A), Term &s2 (&B x)

symbol iprod :
  ∀ (s1 : Sort), ∀ (s2 : Sort), ∀ (A : Univ s1),
  (Term s1 A ⇒ Univ s2) ⇒ Univ (Rule s1 s2)

rule Term _ (iprod &s1 &s2 &A &B) → ∀ (x : Term &s1 &A), Term &s2 (&B x)

definition forall :
  ∀ (t : Univ star), (Term star t ⇒ Univ star) ⇒ Univ star ≔
  λ A, λ B, prod star star A B

definition iforall :
  ∀ (t : Univ star), (Term star t ⇒ Univ star) ⇒ Univ star ≔
  λ A, λ B, iprod star star A B

definition arrow : Univ star ⇒ Univ star ⇒ Univ star ≔
  λ A, λ B, forall A (λ __, B)

definition iforallk :
  ∀ (K : Univ box), (Term box K ⇒ Univ star) ⇒ Univ star ≔
  λ A, λ B, iprod box star A B

definition pi :
  ∀ (t : Univ star), (Term star t ⇒ Univ box) ⇒ Univ box ≔
  λ A, λ B, prod star box A B

symbol const True : TYPE

symbol const I : True

symbol const pure : TYPE

symbol const lam : (pure ⇒ pure) ⇒ pure

symbol Eq : pure ⇒ pure ⇒ TYPE

rule Eq &x &x → True

symbol code : ∀ (s : Sort), ∀ (t : Univ s), Term s t ⇒ pure

symbol uncode : ∀ (s : Sort), ∀ (t : Univ s), pure ⇒ Term s t

rule code _ (prod &s1 &s2 &A &B) (λ x, &f[x])
   → lam (λ (x : pure),
          code &s2 (&B (uncode &s1 &A x)) &f[uncode &s1 &A x])

symbol const dummy : pure

rule code _ (iprod &s1 &s2 &A &B) (λ x, &f[x])
   → code &s2 (&B (uncode &s1 &A dummy)) &f[uncode &s1 &A dummy]

rule code _ _ (uncode _ _ &x) → &x

symbol const iota :
  ∀ (t : Univ star), (Term star t ⇒ Univ star) ⇒ Univ star

symbol mk_iota :
  ∀ (t1 : Univ star), ∀ (t : Term star t1),
  ∀ (t2 : Term box (pi t1 (λ __, ustar))), ∀ (t' : Term star (t2 t)),
  Eq (code star t1 t) (code star (t2 t) t') ⇒ Term star (iota t1 t2)

definition cnat : Univ star ≔
  iforallk ustar (λ X, arrow X (arrow (arrow X X) X))

definition cz : Term star cnat ≔ λ X, λ z, λ s, z

definition cs : Term star (arrow cnat cnat) ≔
  λ n, λ X, λ z, λ s, s (n X z s)

definition inat : Term box (pi cnat (λ __, ustar)) ≔
  λ (x : Term star cnat),
  iforallk (pi cnat (λ __, ustar))
    (λ P, arrow (P cz) (arrow (iforall cnat (λ m, arrow (P m) (P (cs m)))) (P x)))

definition iz : Term star (inat cz) ≔ λ P, λ pz, λ __, pz

definition nat : Univ star ≔ iota cnat inat

definition Z : Term star nat ≔ mk_iota cnat cz inat iz I
francoisthire commented 5 years ago

Maybe I should also mention that this example comes from the encoding of a type system called Cedille, by Aaron Stump. Its main features are implicit products, dependent intersections and an equality on erased terms, which correspond to the iprod, iota/mk_iota and Eq symbols in the file above.

The original file I provided in this issue aims to encode this logic; nat.dk is an example showing that this encoding seems to work. The main benefit of Cedille is that inductive types can be derived, and this is the theorem I proved in nat.dk.

From my understanding, PML is not that far from Cedille. Maybe it is possible to have an encoding of PML in Dedukti then.

fblanqui commented 5 years ago

Remark: the file type checks if we replace the first rule defining code by

rule code _ (prod &s1 &s2 &A &B) &f
   → lam (λ (x : pure), code &s2 (&B (uncode &s1 &A x)) (&f (uncode &s1 &A x)))
francoisthire commented 5 years ago

So it is indeed a higher-order matching issue. In my case, though, I don't want to use this rule since it performs an eta-expansion.

fblanqui commented 5 years ago

No.

fblanqui commented 5 years ago

And this rule is more general since it applies even if the argument is not a lambda.

francoisthire commented 5 years ago

In Cedille it does ^^ .

fblanqui commented 5 years ago

How?

francoisthire commented 5 years ago

Because with this rule, you may have two terms f and λx. f x which are not equal in Cedille, but their encodings will be: the encoding of f will match the left-hand pattern, and hence the encodings of both terms will be convertible.

fblanqui commented 5 years ago

The "normal form" computed by LP for code star cnat cz is:

lam (λx,
       code
         star
         (prod
           star
           star
           (prod
             star
             star
             (uncode box ustar dummy)
             (λ__, uncode box ustar dummy))
           (λ__, uncode box ustar dummy))
         (λs, uncode star (uncode box ustar dummy) x))

This is not a normal form. It seems that (λs, uncode star (uncode box ustar dummy) x) does not match λs, &f[s]. I propose the following explanation: λs, uncode star (uncode box ustar dummy) x is not recognized as a closed term since it contains the free variable x. What do you think, @rlepigre?

fblanqui commented 5 years ago

This is indeed the case as shown by the following example:

symbol T:TYPE
symbol a:T
symbol f:(T⇒T)⇒T
rule f (λx,&F[x]) → a
compute λx,f(λy,x) // does not return λx,a !
rlepigre commented 5 years ago

@fblanqui you are right about that, your explanation is correct. The environment [s] in &f[s] says that the only variable that may appear in the matched term is s.

fblanqui commented 5 years ago

Will it be easy to fix? It seems that we cannot use is_closed. Instead we need to check that some variables do not occur. Is it possible to do this efficiently with Bindlib?

rlepigre commented 5 years ago

But is that really a bug, though? To me it looks like some variable is going to escape its scope here.

francoisthire commented 5 years ago

If you have a rule

rule foo &x &y --> &x

it matches the term foo z z. The fact that z is free is not a problem, right? You don't expect the term that matches the variable &x to be closed.

When you encounter the rule

rule foo (λx, &f[x]) --> &f[0]

For the same reason, you cannot assume that any term which matches &f[x] has only x as a free variable. At least, this is not what was intended when Miller presented his restriction. And the current behavior implemented by Dedukti is the one implemented by every other language I know that supports Miller's pattern fragment.

gabrielhdt commented 5 years ago

I have a branch with decision trees that solves Frédéric's subproblem, but not François'.

fblanqui commented 5 years ago

@rlepigre, this is the difference between programming languages and proof assistants: in proof assistants, we want to compute strong normal forms, hence reduce under lambdas.

fblanqui commented 5 years ago

@gabrielhdt, could you please give a link to your branch?

gabrielhdt commented 5 years ago

https://github.com/gabrielhdt/lambdapi/tree/hobug Roughly, the idea is to memorize each lambda traversed and to check that the bound variables encountered that are not listed in the pattern variable's environment do not appear in the matched term.

rlepigre commented 5 years ago

Sorry guys, I really don't have time to have a closer look right now, but I'll try to do that during the weekend or early next week. I had a quick look at your branch @gabrielhdt, and I think that what you do cannot work: with that approach, free variables may appear in terms through reduction. (Remember that whenever a binder is destructed, e.g., with unbind, it must be reconstructed with bind_var.) It is actually important that the constructed binders are closed during matching, to ensure that no free variable escapes. A solution to the problem will likely involve carrying around yet another environment, but I'm not sure I understand which variables it should contain.

fblanqui commented 5 years ago

We must remember the unbind'ed variables. Take

symbol T:TYPE
symbol a:T
symbol f:(T⇒T⇒T)⇒T
rule f (λx,λy,&F[x]) → a

When matching &F[x] against t, we must check that y does not occur in t.

Gaspi commented 5 years ago

One more unexpected behavior (in old syntax):

N : Type.
0 : N.

def f : N -> N.
[x] f x --> 0.

def g : (N -> N) -> N.
[h] g (x => h x) --> 0.

#EVAL[SNF] ( y => g (x => 0  ) ).   (;  λy, 0               ;)
#EVAL[SNF] ( y => g (x => x  ) ).   (;  λy, 0               ;)
#EVAL[SNF] ( y => g (x => y  ) ).   (;  λy, g (λx, y)       ;)
#EVAL[SNF] ( y => g (x => f y) ).   (;  λy, g (λx, 0)   !!! ;)

The first two SNFs are correct; the third exhibits the problem mentioned by @fblanqui. The last SNF command returns the input of the first example, which is not a normal form.

fblanqui commented 5 years ago

@gabrielhdt, your branch does not seem to work, does it? The following file 225.lp (which you should use instead of ho_bug.lp) fails:

symbol T:TYPE
symbol a : T
symbol f : (T⇒T)⇒T
rule f (λx,&F[x]) → a
assert f (λx,x) ≡ a

Various other files in tests/OK fail too:

KO tests/OK/abstractions.lp
[tests/OK/abstractions.lp, 12:0-47] Assertion failed.
KO tests/OK/freevars-constraints.lp
[tests/OK/freevars-constraints.lp, 12:0-57] Assertion failed.
KO tests/OK/ho_bug2.dk
[tests/OK/ho_bug2.dk, 12:0-27] Assertion failed.
KO tests/OK/miller2.dk
[tests/OK/miller2.dk, 51:0-40] Assertion failed.
gabrielhdt commented 5 years ago

Ah, indeed, various test files are not OK. The file you provided earlier was, and still is, OK.

fblanqui commented 5 years ago

Yes, but it was just a compute, not an assert.

gabrielhdt commented 5 years ago

I visually checked that the result was the one in the comment you wrote beside the compute, that is, λx, a.

fblanqui commented 5 years ago

@Gaspi, the 4th case is in fact the same bug. For computing the SNF, LP first computes the WHNF. When an application (t u) is in WHNF, it then computes the SNF of t and u. Here, because g (λx, f y) is (wrongly) considered to be in WHNF, it computes the SNF of g and λx, f y, which are g and λx, 0 respectively.

gabrielhdt commented 5 years ago

I have fixed my branch; the tests pass and François' file passes.

fblanqui commented 5 years ago

Great! Does it work with https://github.com/Deducteam/lambdapi/issues/225#issuecomment-513223702 too? Why not push this to your master?

In eval.ml, you can improve your code by not building forbidden, that is, replace:

              let forbidden = VarMap.fold (fun _ sv acc ->
                  if Array.mem sv xs then acc else VarSet.add sv acc) to_stamped
                  VarSet.empty in
              let r = VarSet.for_all (fun sv -> not @@ Bindlib.occur sv b)
                  forbidden in

by something like:

              let exception Fail in
              let r =
                try
                  VarMap.iter (fun _ sv ->
                      if not (Array.mem sv xs) && Bindlib.occur sv b
                      then raise Fail) to_stamped;
                  true
                with Fail -> false
              in
gabrielhdt commented 5 years ago

The file iss225 embeds both examples and both work. There is definitely room for optimisation, and the exception seems to be a good solution. I have merged into my master.

fblanqui commented 5 years ago

Will be fixed by merging https://github.com/Deducteam/lambdapi/pull/172.

rlepigre commented 5 years ago

Let's not close until the bug has been fixed in master (I'll take care of that now), especially since #172 is not ready to be merged in my opinion. I'll review it right after I fix master.

fblanqui commented 5 years ago

Could you please explain why Gabriel's branch does not solve the problem? Do you have an example?

rlepigre commented 5 years ago

In fact I'm not sure, I'm still trying to understand what is happening. There are several issues going on at the same time. Solving the issue naively may lead to free variables escaping their scope if we are not careful. For instance, consider the following example.

symbol const T : TYPE
symbol const e : T

symbol get : (T ⇒ T) ⇒ T
rule get (λx, &f) → &f

// Should give [e]
compute get (λx, (λ _, e) x)

What used to happen here was that &f was set to be (λ _, e) x, which was obviously wrong since x escaped its scope, even though the result was still correct in the end (x eventually disappears through computation).

With the approach that I am considering now, this issue is solved. However, I now run into the problem that (λ _, e) x cannot match &f unless we first evaluate it to e. In general, we may need an arbitrary number of reduction steps, at arbitrary places, to ensure that "forbidden" variables disappear (and weak head normalization is not enough). I'm not sure what the desired approach is here. One solution would be to compute the (strong) normal form before matching, which would guarantee that every variable that can disappear does disappear. But that's probably not "lazy" enough, is it?
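
To make "weak head normalization is not enough" concrete, here is a small variation of the example above (the constructor c is made up for illustration): the redex (λ _, e) x now sits under a constructor, so it is not reduced by weak head normalization of the argument, and x only disappears after reducing under c.

symbol const c : T ⇒ T

// Should give [c e], but the argument must be normalized under c before
// the forbidden variable x disappears and the pattern (λx, &f) can match.
compute get (λx, c ((λ _, e) x))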

Does Dedukti do anything special to enforce that free variables do not escape their scope @francoisthire? And is anything special done for that kind of case in your branch @gabrielhdt?

francoisthire commented 5 years ago

Maybe @barras or @Gaspi will give a better answer than mine. But in Dedukti you cannot write your rule: there is a restriction that the arity of every variable on the right-hand side of the rule should be greater than the arity of any occurrence of this variable on the left-hand side.

But here, I think we also try to solve this problem by computing the SNF of the argument if matching failed beforehand.

gabrielhdt commented 5 years ago

In my branch, &f would be (λ _, e) x, as you said, but I don't see why that is wrong. On the other hand, indeed, unless we beta-normalise, the term won't match, since &f must contain no free variable.

rlepigre commented 5 years ago

@gabrielhdt it is very wrong, because reducing get (λx, (λ _, e) x) (which is closed) will produce (λ _, e) x (which is open) as an intermediate step. The fact that x is here erased after a step of beta-reduction is pure coincidence, and it would be very much possible for a variable to escape through this mechanism.

rlepigre commented 5 years ago

@francoisthire my example comes from a simplification of the ho_bug1.dk file (see here) that is used in our unit tests. The problem with this example (under my attempted fix) is essentially the same, so I don't understand why Dedukti would reject the above example and still allow the example in ho_bug1.dk.

fblanqui commented 5 years ago

Rodolphe, I believe that there is a misunderstanding about what Gabriel said. Gabriel, correct me if I am wrong, but I think you said that, currently, in your branch, get (λx, (λ _, e) x) does not reduce because (λ _, e) x contains x. If we were reducing (λ _, e) x before testing variables, we would get e, and then get (λx, (λ _, e) x) would indeed be reducible to e, because e does not contain x. So you are right, Rodolphe: this reduction strategy is not complete. But free variables are preserved: no bound variable will escape its scope. To fix this new problem (strategy completeness), we need to strongly normalize the argument before testing variables. Note, however, that this is only necessary when there are indeed forbidden variables, which is very rare in practice.

So, Gabriel, I am thinking that, in your decision trees, you should have a variable test only if there are forbidden variables, as in Rodolphe's example. For instance, in f (λx, g (&F[x])) → r, there is no check to do. Is that what you implemented?
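
To restate the two situations in the syntax of the examples above (a sketch, reusing Rodolphe's symbols for the second rule):

// No check needed: the environment [x] of &F mentions every λ traversed,
// so no variable is forbidden in the matched term.
rule f (λx, g (&F[x])) → r

// Check needed: &f has an empty environment, so the variable x of the
// traversed λ must not occur in the term matched against &f.
rule get (λx, &f) → &f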

gabrielhdt commented 5 years ago

That is indeed what I meant. It is also what I thought I had implemented, but in fact I just found a bug in some auxiliary function; I'm fixing it.

Regarding the conditional test, I intended to do this, but some tests failed, namely the Miller ones, because I had to check that variables are swapped in this test. Otherwise, the decision on whether a test has to be performed is made at compile time and takes into account the bracketed variables and the traversed lambdas; it is done here.
Regarding the conditional test, I intended to do this, but some tests failed, namely the miller ones, because I had to check that variables are swapped in this test. Otherwise, decision on whether a test has to be done is done at compile time and takes into account the bracketed variables and the traversed lambda; it is done here.