harp-project / AML-Formalization

GNU Lesser General Public License v2.1
10 stars 5 forks source link

Translation of nameless proofs to named proofs using proven alpha-renaming? #362

Open nishantjr opened 1 year ago

nishantjr commented 1 year ago

I'm not convinced that there exists a naming scheme for variables that meets all 5 requirements in the report. This is because in the nameless representation we are reasoning modulo alpha-equivalence and is therefore more powerful than the named one. My suspicion is that proofs the make copies of patterns (e.g. KT) may cause problems. I've not tried too hard to come up with a counter-example though.

In particular, does there will always exist a direct (without some alpha-renaming) translation of the proofs from nameless to named? Even if it is possible, it looks trick and complex. Could we avoid the problem entirely?

One way to do this is to use a less direct translation, using proven alpha-equivalence at certain points, allowing us to drop requirement 3. For now, let us ignore metavariables and return to them later.

I hope I've not oversimplified things, and completely underestimated the problem.

Requirements

In order to translate proofs it is sufficient to have a translation function t: NamelessPattern -> NamedPattern, such that for any nameless proof rule,

        phi_1 ... phi_n
        ---------------
              psi

there is a named proof rule or theorem, such that

        t(phi_1) ... t(phi_n)
        ---------------------
               t(psi)

Further, we need that:

    eraseNames(t(phi_i)) = phi_i        (ERASURE),

so that we can translate proofs from the nameless to the named representation.

Candidate function for t

Let us define a stateless translation function, t: NamelessPattern -> NamedPattern. Let n_e: NamelessPattern -> EVar and n_s: NamelessPattern -> SVar be injective functions, e.g. returning a unique string representation of each distinct pattern. Define:

    t(exists phi) = (exists n_e(phi) t(phi[n_e(phi)/0]))
    t(mu     phi) = (mu     n_s(phi) t(phi[n_s(phi)/0]))

        where `phi[x/0]` names the previously bound variable to `x`.

All other patterns are translated in the obvious way.

For axioms that do not deconstruct binders as well as (ex-quan), (ex-gen), it is clear that t meets our requirements. This leaves us with (prop-ex-*), (pre-fixedpoint), and (knaster-tarski). that do not fulfill (ERASURE). Let us prove alpha-renamed versions of these. Conveniently, for any named pattern phi we have:

    eraseNames(exists x phi[x / y]) === eraseNames(exists y phi), and
    eraseNames(mu     x phi[x / y]) === eraseNames(mu     y phi)

when x is fresh in phi.

Using the Metamath proof rules we can also prove that:

    (exists x phi[x/y]) <-> (exists y phi)
and (mu x phi[x/y]) <-> (mu y phi)

when x is fresh in phi.

This allows us to prove an alpha-renamed versions of (prop-ex-), (pre-fixedpoint), and (knaster-tarski) that do* fulfill (ERASURE). (TODO: Paper proofs are in the final section, but I've not tried mechanizing these yet, so there may be errors.)

This technique should also work for axioms applied using the (hypothesis) proof rule in Coq, and for translating theorem statements and hypothesis, through congruence-of-equivalence theorems (Proposition 44 in the 2019 TR).

Dealing with metavariables.

We already have the metamath axioms (where ==> and <===> respresent provability in the #Substitution relation):

I think we may additionally need:

While this is a meta-theorem, I feel that this is a reasonable axiom to add, since it only talks about the meta-operation substitution and is only needed for metavariables, if it lets us move forward with the MM translation. I feel that we need to strike a balance between practicality and purity and if this axiom or similar ones are needed to respresent, or significantly shorten some meta-ish proofs in Metamath it is worth it. In the long run, as Xiaohong had mentioned, we can have a meta-level representation of matching logic's proof system itself to encode these theorems if necessary.

Alpha renaming in MM

We start from the goals and work backward.

Proof of (exists x phi[x/y]) <-> (exists y phi) when x is fresh in phi.

        (exists x phi[x/y]) -> (exists y phi)           by (proof-rule-gen)
<====   phi[x/y] -> (exists y phi)                      by (proof-rule-exists)
        (exists y phi) -> (exists x phi[x/y])           by (proof-rule-gen)
<====   phi -> (exists x phi[x/y])                      by (substitution-identity), (substitution-inverse)
<====   phi[x/x] -> (exists x phi[x/y])                 by (substitution-fold)
<====   phi[x/y][y/x] -> (exists x phi[x/y])            by (proof-rule-exists)

Proof of (mu x phi[x/y]) <-> (mu y phi) when x in fresh in phi.

        (mu x phi[x/y]) -> (mu y phi)                   by (proof-rule-kt)
<====   phi[x/y][(mu y phi)/x] -> (mu y phi)            by (substitution-unfold)
<====   phi     [(mu y phi)/x] -> (mu y phi)            by (prefixedpoint)
        (mu y phi) -> (mu x phi[x/y])                   by (proof-rule-kt)
<====   phi[(mu x phi[x/y])/y] -> (mu x phi[x/y])       by (substitution-fold)
<====   phi[x/y][(mu x phi[x/y])/x] -> (mu x phi[x/y])  by (prefixedpoint)
fiedlr commented 1 year ago

The idea of using pattern hashes for fresh variable names and the focus on the "rules". So you're saying that this local view could let us ignore thinks like commutativity of and, because the patterns used in the same rule have to be syntactically equal (so the hashes will be the same)?

Two other comments:

  1. We also need M |= phi iff M |= t(phi), ideally M, rho, m |= phi iff M, rho, m |= t(phi), i.e., to exclude options like t(phi) = bot.

  2. Are you sure t(exists phi) = (exists n_e(phi) t(phi[n_e(phi)/0])) is what we want? To be honest, I'm not sure what you mean by: phi[x/0] names the previously bound variable to x.

If it means that you simply subtitute x for 0 where it is unbound, from what I read at https://en.wikipedia.org/wiki/De_Bruijn_index , I am not sure if this works because 0 is relative: 0 does not point to the outermost binder, but to the closest binder. I understand that you want to substitute for the previously bound 0, but how do you handle other indices?

For example, exists 0 and (exists 0 1). The definition is top-to-bottom. So when you generate the name for the outermost index, you should also substitute for 1.

Just an idea: maybe what would work is bottom-to-top with decrementing on other indices like exists 0 and exists 0 1 ~> exists 0 and exists (n_e(exists 0 1)) (n_e(exists 0 1)) 1 ~> exists 0 and exists (n_e(exists 0 1)) (n_e(exists 0 1)) 0), so we would have intermediary results where index d points to the unnamed index with distance d. This way we could always replace for 0 because the lower levels have been transformed.

But I might just not understand De Bruijn indices because I never worked with them :-)

  1. So by the alpha renaming you just want to make sure that you use the variable based on the hash of the subpattern, like in exists-gen?

  2. mu x should be mu X

nishantjr commented 1 year ago

Yes, the main idea here is that the nameless proof system is more powerful because it reasons modulo alpha-equivalence. We may therefore need to use more complex proofs in the named version to recover this. Alpha-equivalence is not something that is "built into" the named proof system, but rather something that must be proved.

So you're saying that this local view could let us ignore thinks like commutativity of and, because the patterns used in the same rule have to be syntactically equal (so the hashes will be the same)?

Commutativity of and is something that needs to be proved. This translation doesn't take it for granted, and things like a /\ b and b /\ a will have different hashes.

We also need M |= phi iff M |= t(phi), ideally M, rho, m |= phi iff M, rho, m |= t(phi), i.e., to exclude options like t(phi) = bot.

That's a good point. I think (ERASURE) rules out t(phi) = bot specifically, but there could be other translations that do not preserve semantics, e.g. by mapping models to isomorphic models, e.g. N to N + N in the theory of the naturals.

To be honest, I'm not sure what you mean by: phi[x/0] names the previously bound variable to x.

I wasn't really sure what the correct notation for naming a variable in a nameless pattern is and just made one up. But perhaps this makes more sense: t(exists phi) = (exists n_e(phi) t( instantiate(exists phi, n_e(phi)))). Here, instantiate(exists phi, x) instantiates the outermost binder to the named variable x. For your example, instantiate(exists 0 and (exists (app 0 1)), x) = x and (exists (app 0 x)). I've not worked with De Brujin notation much either but I think there is already good definitions of this in this repo :-)

berpeti commented 1 year ago

I started implementing @nishantjr proofs from above with the named proof system in Coq. However, I found that the axioms substitution-fold and substitution-unfold cannot be derived for the standard definition of substitutions. I provide a counterexample below.

phi[x/y][psi/x] = phi[psi/y]     (for some fresh 'x')

To show that the equality above does not hold, we can use the following instances of the metavariables:

phi = mu z.z
psi = z -> z
x = x
y = z

With these instances, we simplify the definition of substitutions.

                                      shadowing             renaming
phi[x/y][psi/x] = (mu z.z)[x/z][z -> z/x] = (mu z.z)[z -> z/x] = mu y.y     (for some fresh 'y' which differs from 'z')

                            shadowing
phi[psi/y] = (mu z.z)[z -> z/z] = mu z.z

Therefore

phi[x/y][psi/x] = mu y.y =/= mu z.z = phi[psi/y]

I think, the same train of thoughts also applies for the provability of #Substitution in Metamath by using substitution-mu-shadowed for shadowing and substitution-mu for renaming; however, I do not know whether it makes the axioms inconsistent.

There were no issues with the existential case, because the axioms can be derived for renaming.

nishantjr commented 1 year ago

@berpeti I'm not sure if I've understood things fully, but if = here is intended to be syntactic equality, then renaming and substitution-mu seem overpowered here, and have alpha-renaming of fixedpoint patterns built in? If = is intended to be syntactic equality, then these are the rules with soundness issues, and not substitution-fold/unfold?

If so we should discuss replacing these rules.

berpeti commented 1 year ago

I double-checked the Metamath formalization of #Substitution. Let me rephrase the issue with it; it is possible to derive both #Substitution (\mu z z) (\mu z z) (\imp z z) z and #Substitution (\mu y y) (\mu z z) (\imp z z) z.

The first proposition is proved by the axiom substitution-mu-shadowed. For the second proposition, I provide a derivation tree (for readability I shortened #Substitution as #Sub, and omitted substitution from the rule names):

                                                 -------------- (var-same)  ----------------------- (var-diff)
                                                  #Sub y z y z               #Sub y y (\imp z z) x
-------------------------------- (mu-shadowed)    ------------------------------------------------- (mu)
  #Sub (\mu z z) (\mu z z) x z                          #Sub (\mu y y) (\mu z z) (\imp z z) x
---------------------------------------------------------------------------------------------- (fold)
                             #Sub (\mu y y) (\mu z z) (\imp z z) z

That is, (mu z.z)[z -> z/z] is in #Substitution relation with (at least) two results. Therefore, there is no function (i.e., deterministic relation) that can satisfy these axioms.

nishantjr commented 1 year ago

I see. That makes sense. The problem is that substitution-mu has alpha renaming built into the rule. I feel that this is not good because it doesn't separate concerns.

There are two directions we can take:

  1. Decide that #Substitution is an equivalence relation modulo alpha-renaming.
  2. Decide that #Substitution only deals with capture free substitution.

In either case, my feeling is that substitution-mu is too powerful, and should only deal with capture-free substitution. If we decide on the first approach, we should have an explicit axiom, alpha-renaming, and handle element variables and set-variables uniformly. In this case the alpha-renaming axiom can be used directly to bridge the gap between the named and nameless proof systems. Since it seems likely that we can prove alpha-renaming, I prefer the second approach (but I'm open to having an explicit alpha-renaming rule if it gets things moving). I propose we replace substitution-mu with the following axiom:

#Sub ph2  ph1[ ph0 / xX ]
-------------------------------------------------------------
#Sub  (\mu X ph2)  (\mu X ph1)[ ph0 / xX ]

from the current:

#Sub ph3 ph1[Y / X][ ph0 / xX ]
------------------------
#Sub (mu Y ph3) (\mu X ph1)[ ph0 / xX ]

In the original notation:

${
    $d xX X  $.
    $d Y ph0 $.
    substitution-mu.0 $e #Substitution ph2 phi1 ph0 xX $.
    substitution-mu   $a #Substitution ( \mu X ph2 )  ( \mu X ph1 ) ph0 xX $.
$}

For context, this is the current rule.

${
    $d xX X  $.
    $d Y ph0 $.
    substitution-mu.0 $e #Substitution ph2 ph1 Y X $.
    substitution-mu.1 $e #Substitution ph3 ph2 ph0 xX $.
    substitution-mu   $a #Substitution ( \mu Y ph3 ) ( \mu X ph1 ) ph0 xX $.
$}
berpeti commented 1 year ago

Actually, this issue is not specific to only mu quantifiers. In case of existential quantifiers, I managed to create a workaround by using renaming instead of substitution (the proof rules about existential quantifiers only require renaming).

I do not think, it is a good decision to change substitution-mu (and substitution-exists), because they follow the standard axiomatization of named capture-avoiding substitutions (e.g., Plotkin's work, page 3). In my opinion, it would be better to introduce the missing side conditions to these rules (based on Plotkin's work above).

nishantjr commented 1 year ago

Thanks for the explanation. That makes a lot of sense.

In case of existential quantifiers, I managed to create a workaround by using renaming instead of substitution

I don't understand what renaming is? Is that a proof rule in the named Coq formalization?

It seems that in the metamath formalization #Substitution defines a function up to alpha-equivalence, whereas the linked work (image included for reference) as well as the Coq formalization defines a function that also takes as input an ordered infinite list of variables to deterministically choose a fresh variable. I'm not sure that defining the second side condition in Metamath will be straightforward since we don't have a well-order over variables, only distinctness.

image

Lets discuss this is detail in the meeting next week.

berpeti commented 1 year ago

Renaming is defined by a function, just like the substitution. It is basically a special substitution that replaces a variable with a variable.

In Metamath I think, it is not needed to find exactly the variable specified in either of the definitions. The important bits are the side conditions of (1) and (2) that need to be captured, and the variable z could be anything that satisfies them.

Lets discuss this is detail in the meeting next week.

I agree.