Open tirix opened 2 years ago
The proof checker itself does not need those, and it shall not be impacted.
I've built the Formula
around Atom
s, which are basically just repurposed u32
, so my need would be around assigning new label Atoms which could be used in formulas, and possibly printed.
I see two possible approaches:
Nameset
to do so, which is responsible for allocating Atoms. The user would also provide a Token
(Box<[u8]>
) for the name, and would have to keep track of the work variables it has created, since there is no obvious way of distinguishing them from regular variables.u32
for work variables, or use an enum
. The advantage would be that it is then immediate to check if a variable is a work variable. The drawback is that this would add a performance cost to all methods involving Atoms. A simple lookup would turn into a test + lookup.Note that for each work variable, we would possibly need two new Atoms:
Symbol
Atom) for resolving a textual representationLabel
Atom) for use in formulasIn both cases, it would not be possible to associate a StatementAddress
or we would need to associate the default StatementAddress
in order to be able to use lookup_float
, lookup_symbol
or lookup_label
.
I think we should not use Formula
for formulas with work variables or other "proof assistant" features at all. This is overhead for everything that handles the completed formulas, and it would definitely show in the verification time of set.mm. Instead, use a special formula type for expressions with metavariables, and loading a theorem into the proof assistant entails converting from the Formula
representation to this expanded type. I'm not even sure whether it needs to be part of mm-knife; at least, it can be a separate module that doesn't interact with other things, but it could also be a separate dependent crate.
Thanks for your quick feedback!
Actually, Formula
is not used at all in the proof verification, metamath-knife's proof verifier works directly on textual tokens. As a matter of fact I've only introduced grammar parsing and formulas for the sake of unification in a proof assistant.
Though, I believe that once one has a parse tree for all formulas in the database, proof verification might possibly be sped up significantly, since a substitution can be performed in constant time, and best-case comparison of different formula is also constant time if the parse tree root differs.
That was part of my project in the somewhat orphaned #63 : building an abstract proof verification, which could be used both on textual representation of expressions (like currently), or on formulas, or any other support.
I'll give it some thought, as it sounds like the best way to implement this is inheritance and overriding, but rust usually forces to think differently: here, Formula would typically become a trait
. What about making the "Label" a generic type in formulas?
If you do have a type for this, you should not mix constants, variables and metavariables. From a proof assistant's perspective, variables act like constants and metavariables act like variables, because only the latter are subject to substitution. The distinction between constants and variables still matters however, because disjoint variable requirements are expressed in terms of the variables contained in the expression (and you need to have some higher order thing in order to handle disjoint variable requirements for unresolved metavariables properly; mmj2 just defers the check until the metavariable is completely assigned).
Regarding "inheritance", I would just have a From<Formula> for MetaFormula
implementation like
match x {
Formula::Const(x) => MetaFormula::Const(x),
Formula::Var(x) => MetaFormula::Var(x),
}
where Formula
is an enum with two variants and MetaFormula
is an enum with three variants. There is also the matter of whether the proof assistant is using trees or string representation for formulas; you can implement metaformulas either way but it affects the way you write things quite a bit.
From a proof assistant's perspective, variables act like constants and metavariables act like variables
Indeed, interesting.
I would just have a
From<Formula> for MetaFormula
implementation like...
With Formula
as a tree, and variables hidden inside, I don't see it work this way. Right now I'm rather thinking about a
struct Metaformula {
formula: Formula,
metavariables: ...
}
Where metavariables
would allow to identify where the metavariables are (or "work variables" in MMJ2 parlance).
But this does not solve the fact that I still need fresh metavariables inside my Formula
structure.
There is also the matter of whether the proof assistant is using trees or string representation for formulas; you can implement metaformulas either way but it affects the way you write things quite a bit.
I've been using the metamath-knife Formula
type, which is a tree.
MMj2 represents formulas as trees of statements (the parse tree), and work variables are WorkVarHyp extends VarHyp
. So it is as if you had some other kind of my-work-var $e
statement you can use to build up your parse tree.
As another point of comparison, in MM0 expressions are represented completely differently in the proof assistant and in the packed version in the environment. The latter is an Expr
type which is a tree of term applications to variables, while the former is a LispVal
which is just a lisp s-expression, passed around by the user code, where things like (foo a b c)
are interpreted as an application of the expression constructor foo
. Metavariables are special lisp values #<mvar i>
that can be constructed through builtins (but cannot be confused either with atoms like x
which represent variables, or lists like (foo)
which represent constants).
Regarding your last edit: metavariables are allocated and their assignments tracked by the proof context (global to the whole proof, not just the one expression). So there is no need for a metavariables
field in Metaformula
like you wrote. I wouldn't do a detailed tracking of the metavariables in the formula other than perhaps a boolean flag to indicate if any metavariables are in the subtree (this can speed up metavariable instantiation).
Regarding your last edit
Sorry this was misleading, of course metavariables shall be tracked at the level of the proof context. What I meant there was a kind of boolean flag to indicate where in the subtrees there are metavariables, as a kind of local cache/speedup, quite like you explain in your last sentence.
If I remember correctly, MM0 has a very rich Lisp syntax which includes not just expressions and metavariables, but also instructions/commands. I assume that this is what LispVal
corresponds to. It looks like it boils down to a kind of enum
where one option is to itself be a list, so this is also a kind of tree structure, right?
Yes, LispVal
is the set of lisp values (i.e. anything you can get by evaluating a lisp expression), so it also includes things like closures, strings and other stuff. It's similar to what you would find in the Value
type of dynamic languages like Javascript or Python (or Lisp). This is used for all metaprogramming activities in the language. But when you pass one of these objects to a theorem proof as in theorem foo: $ T. $ = <lisp code here>;
, the value is interpreted as a MetaFormula
which means that most of those other kinds of values would cause a type error.
The grammar of allowed expressions looks something like this, which you should think of as a subset of LispVal
(this enum doesn't actually exist, it is for illustration purposes):
enum MetaFormula {
Atom(AtomId), // variable/hypothesis reference
List(AtomId, Vec<MetaFormula>), // term/theorem application
MVar(MVarId), // term metavariable
Goal(GoalId), // proof metavariable
}
So yes, it is a tree structure, with explicit variants for metavariables. (There is also a bit more than this, for (:conv e c p)
conversion proofs, but those don't come up in metamath.)
Here is the actual code that does the parsing of LispVal
into ProofHash
objects, which eventually become ProofNode
which is the storage format for completed proofs.
Note that in this kind of setup, metavariables can end up hanging around in the term even after they have been assigned, because there is no explicit list of all expressions everywhere which we can iterate over to eliminate all occurrences of the metavariable. (And even if there was, this would be quite expensive when you are creating and assigning lots of short-lived metavariables.) Instead, you just remember that the metavariable is assigned, and make sure that anything you do with expressions treats assigned metavariables as equivalent to the expression that they were assigned to.
MMj2 represents formulas as trees of statements (the parse tree), and work variables are WorkVarHyp extends VarHyp. So it is as if you had some other kind of my-work-var $e statement you can use to build up your parse tree.
This is the method I'd like to use. There will still be another layer of meta variables on top of that (like goal
, at tactics language level), but metamath-knife will not know about them at all because they will be evaluated before any call to metamath-knife API.
Nevertheless, the work variables / metavariables are not evaluated, but are to be resolved through unification - just the one that metamath-knife shall provide.
It looks to me like the best way to do that would be to use a generic data type L
to use with Formula<L>
, where L
will usually be Label
when created from the grammar, but where it might also be a user specific type optionally including meta-variables. I believe I would still keep FormulaRef
as using Formula<Label>
internally. There shall be no runtime cost. This shall allow to have this work variable / meta variable mechanism without major impact in metamath-knife.
This would be great! I find work variables quite helpful.
I'm currently slammed on other things, but I'm hoping to become more present for Metamath stuff in the nearer future. Until then, if I'm a blocker for something let me know, I do not want to inhibit the awesome stuff everyone else is doing.
So PR #89 is a proposal to introduce support for work variables, whereas they are supplied externally as suggested by Mario.
In MMJ2, whenever applying a theorem leaves unresolved substitutions, like when applying
~addcan2ad
in the example below, work variables such as&C1
are created.Coq calls them existential variables, and it looks like Lean calls them metavariables.
It would be nice if
metamath-knife
could support this functionality.