ttuegel opened this issue 3 years ago
In fact, the whole business of transitively related symbols through `anywhere` rules makes me inclined to declare that the backend isn't going to support this at all, but will support applying arbitrary matching-logic rules, so the user can define custom unification cases that way.
Would it be an option to make `anywhere` rules work through rewrites instead of equality axioms? They would not work in predicates, but in the main configuration it would just mean an extra rewrite step every now and then. And then their symbols would just be constructors, as they were intended in the first place.
That would work operationally. How would the rules be encoded in Kore?
I would encode them as rewrites without any surrounding cells, and without any of the default things that are usually added, like `~> K` after the LHS. But I would add the `anywhere` attribute so that the Haskell backend knows how to apply them, including that it should lift the rewrites to the top after applying them (or that it should pretend that it lifts those rewrites).
As far as I can tell, if you have `a => b` as an axiom, then you can apply that rewrite in the middle of a term as long as you have only constructors above, i.e. for any constructor-based context `C`, you have `C[a] => C[b]`:
```
a => b is T
a -> next(b) is T
not(a) or next(b) is T
C(not(a) or next(b)) == C(T)
C(not(a)) or C(next(b)) == C(T)
// chapter 4.4 of The Semantics of K
C(not(a)) or next(C(b)) == C(T)
// 'or' both sides with not(C(a))
not(C(a)) or C(not(a)) or next(C(b)) == C(T) or not(C(a))
// C(T) or not(C(a)) == T
not(C(a)) or C(not(a)) or next(C(b)) == T
// It should be the case that, if C has only constructors over its argument, C(not(a)) -> not(C(a))
not(C(a)) or next(C(b)) == T
C(a) -> next(C(b)) is T
C(a) => C(b) is T
```
I'm not fully sure how that would interact with, say, all path proofs, but we would probably have similar problems if we use equalities.
How this interacts with all-path reachability is certainly an unsolved problem. We do NOT have this problem with equality because the backend is allowed to assume equalities are confluent; if the user provides non-confluent equalities then the semantics is invalid and we may prove anything.
While writing the answer below, I just thought of something:
When translating to Kore, we could split symbols like `&` in two: a function part and a constructor part (let's call them `f&` and `c&`). We would translate `anywhere` rules for `&` as equations having the function form on both the LHS and the RHS, with an `otherwise` case that translates the function form to the constructor form. On the LHS of any other rule (equation or rewrite) we would use the constructor form; on the RHS we would use the function form.
The actual answer:
I meant something which is similar to confluence, but slightly different. Normally, equalities work with all-path (also) because we use them for functions which get resolved to a canonical form made of constructor-like stuff, with which we can unify unambiguously, i.e. we always know which rewrite rules can be applied. However, for `anywhere` rules, if there is a canonical form, then it may be based on the same (non-constructor) symbols. That is, we would have symbols which are constructors only if we consider some sort of equivalence class modulo (anywhere) equations, like `&` in your example.
In other words, if you have a rule saying `rule a(10) => a(0) [anywhere]`, and you encode that as an equation, and your start configuration is `C(a(10))`, then, in principle, you might have a rule saying `rule C(a(10)) => stuff`, so you can't just try to apply rules to `C(a(0))` and do unification as if `a` were a constructor.
One option is to add some constraints over the semantics, @traiansf mentioned that Maude uses this: http://maude.lcc.uma.es/maude31-manual-html/maude-manualch5.html#x37-690005.3 (the paragraph before figure 5.1). Compared to confluence, that seems to me to be less intuitive, harder to write properly and harder to prove that it applies to one's semantics.
There are two other options which would not work in the general case, but may work for enough real-world cases. The first is to do unification modulo axioms; the second is to build all possible forms of the configuration (i.e. `C(a(10))` and `C(a(0))`), and only then treat `a` as a normal constructor.
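For concreteness, the second option can be sketched in Python (a toy term representation; the `anywhere` rule `a(10) => a(0)` is the example from above): compute the closure of a configuration under the `anywhere` equations, and only then unify against all the resulting forms.

```python
# Toy terms: tuples ("symbol", child, ...) with ints as leaves.
# Anywhere equation from the example above: a(10) => a(0).

def apply_anywhere(term):
    """Yield every term reachable by one application of a(10) => a(0)
    at any position inside `term`."""
    if term == ("a", 10):
        yield ("a", 0)
    if isinstance(term, tuple):
        sym, *args = term
        for i, arg in enumerate(args):
            for new_arg in apply_anywhere(arg):
                yield (sym, *args[:i], new_arg, *args[i + 1:])

def all_forms(term):
    """All configurations equal to `term` modulo the anywhere equation."""
    seen, todo = {term}, [term]
    while todo:
        for u in apply_anywhere(todo.pop()):
            if u not in seen:
                seen.add(u)
                todo.append(u)
    return seen

# C(a(10)) has two forms: C(a(10)) itself and C(a(0)).
assert all_forms(("C", ("a", 10))) == {("C", ("a", 10)), ("C", ("a", 0))}
```

Only after computing both forms could a rule like `rule C(a(10)) => stuff` be matched reliably; the catch is that the closure may be large (or infinite) in general, which is why this only works for some real-world cases.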
In my opinion, it may be better to start with how we want these to work. I assume that we want `anywhere` rules to be unidirectional, which, by default, should mean either using rewrites or using plain functions instead of `anywhere`. If we want them to be bidirectional, then the default option would be to just use functions, although making configuration equivalence classes would also be nice (e.g., as @traiansf explained to me once, it would also allow proper non-determinism for heating and cooling rules).
Another option we should seriously consider is treating `anywhere` rules as theory transformations which we would apply at compile time. We would interpret a set of `anywhere` rules,
```
cᵢ(Aᵢ) => Bᵢ  // cᵢ is a constructor symbol. Aᵢ and Bᵢ are constructor-based patterns.
```
as equalities. We would require that the user-defined rules are orderable, `cᵢ > Σ(Aᵢ) ∪ Σ(Bᵢ)`, so that we can apply the equations recursively to each other (let `Σ(_)` be the set of symbols in a pattern). At compile time, we will find every `cᵢ` and make the replacement:
```
cᵢ(φ) => ((φ ∈ Aᵢ) ∧ Bᵢ) ∨ (¬(φ ∈ Aᵢ) ∧ cᵢ(φ))
```
We will make this replacement on the left- and right-hand side of every rule, including the other `anywhere` rules (thus the ordering requirement). We will also apply this transformation to every pattern and specification. From then on, we can consider every occurrence of `cᵢ` to be distinct from `Bᵢ`, so that `cᵢ` is effectively a free constructor.
I like the theory-transformation idea, but I'm hesitant about applying it on the LHS of the rules. Is there any reason to do so? If you just apply them on the RHS of the rules, then you'll know that during execution you're never generating a term that could have been simplified by `anywhere` rules.

Applying them on the LHS of rules might change user intent. Here is a really silly and contrived example:
```k
syntax B ::= "true" | "false" | B "/\" B
rule true /\ B => B [anywhere]
rule false /\ B => false [anywhere]

syntax B ::= negate ( B ) [function]
rule negate(true /\ B) => negate(B)
rule negate(false /\ B) => B
```
If you simplify the LHS of the first `negate` rule, then you get an infinite loop: `negate(true /\ true) => negate(true /\ true) => negate(true /\ true)`. If you don't, then it simplifies in one step to `negate(B)`.
So simplifying the LHS of rules with the anywhere rules may make them apply more (or less) than the user intended.
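The loop can be seen concretely with a small Python simulation of the `anywhere` simplifier (a toy encoding of the `B` syntax above; tuple terms stand in for K terms):

```python
# Terms: ("true",), ("false",), ("and", l, r); variables are plain strings.

def simp(t):
    """Apply the anywhere rules true /\ B => B and false /\ B => false,
    bottom-up, everywhere in the term."""
    if isinstance(t, tuple) and t[0] == "and":
        l, r = simp(t[1]), simp(t[2])
        if l == ("true",):
            return r
        if l == ("false",):
            return ("false",)
        return ("and", l, r)
    if isinstance(t, tuple):
        return (t[0],) + tuple(simp(c) for c in t[1:])
    return t

# The LHS of the first negate rule, negate(true /\ B), simplifies to
# negate(B) -- so the simplified rule is negate(B) => negate(B), which
# matches every negate term and rewrites it to itself: an infinite loop.
assert simp(("negate", ("and", ("true",), "B"))) == ("negate", "B")
```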
I think, BTW, that maybe the idea of coherence completion may be helpful here: https://experts.illinois.edu/en/publications/generalized-rewrite-theories-coherence-completion-and-symbolic-me . Specifically, the process of ordering the rules and inter-applying them to themselves sounds very similar to a part of the coherence completion process, which allows you to take non-free constructor definitions (non-free because there are equations over them), and simplify the theory to one only over the free variants of those constructors. At least, that's my (shaky) memory of the process. I may also be remembering some other process.
@sskeirik what do you think?
TL;DR: I'm not sure of all the details, but my summary is that I think narrowing with `anywhere` rules will ruin completeness of narrowing, which is probably an unacceptable trade-off. Also, as I think about it, depending on how conditional equations are used, narrowing may also be incomplete.
Here are some more details about my definitions and important results.

Definitions (the distinction between axioms, equations, and rules is in the rewriting logic (RL) sense, which is not the same terminology used in the matching logic world; below, I use the term *standard equation* to refer to equations in the typical mathematical sense):

- *Equations* are standard equations (`L = R`) which, when oriented, are terminating and confluent as well as coherent with the axioms; we also require that `L = f(t_1, ..., t_n)` where the `t_i` for `1 <= i <= n` are constructor terms. We can also consider conditional equations `L = R if C`, which have similar notions of termination and confluence (but which are more complex, to account for the condition).
- *Rules* (`L -> R`, or `L -> R if C` when they are conditional) are assumed to be coherent w.r.t. our equations modulo axioms, and `L` must be a constructor term --- in some cases, we restrict conditional rules to only use equations in the condition; this simplifies life slightly.
- A symbol `f : s_1 ... s_n -> s` is *defined* whenever there is a set of equations whose LHSs cover all instances of `f`.
- A symbol `f : s_1 ... s_n -> s` is a *constructor* if it is not defined. If the constructor has no equations or axioms which unify with it, it is a *free constructor*. If the constructor only unifies with axioms, it is a *free constructor modulo axioms*. Otherwise, it is a *non-free constructor*.
- The two notions of *coherence* --- (a) between equations and axioms and (b) between rules and equations modulo axioms --- are actually slightly different, but both relate to the idea that we can execute the different layers of the theory in order without losing completeness.

Important results:

- There are important results about standard forward narrowing with unconditional equations which are confluent, terminating, and coherent w.r.t. axioms that guarantee completeness. To ensure that the narrowing process always terminates, we additionally require the finite variant property --- otherwise, narrowing in this case is complete but in general non-terminating (partial narrowings are sufficient to prove most properties).
- There are some results about completeness of narrowing with conditional equations, but they are much more restrictive; I think a theory layering is required where the conditions in conditional equations have a complete and terminating narrowing algorithm so that we can solve them --- otherwise, we cannot reliably complete even one step of conditional narrowing.
- We also have results for standard forward narrowing with entire well-formed RL theories (where well-formed means the RL rules, equations, and axioms satisfy the conditions above), in which completeness depends crucially on all RL rules being topmost --- a rule is topmost when it only unifies with terms at the topmost position. I forget whether the finite variant property is also required for completeness, and what restrictions are required on the conditions of conditional rules to guarantee completeness.
- Unfortunately, completeness of standard forward narrowing is lost as soon as we allow narrowing at non-topmost positions. There is a complete algorithm for arbitrary narrowing which is much more expensive --- it involves narrowing bi-directionally, if I understand correctly (from the start point forwards and the end point backwards).
> Is there any reason to do so? If you just apply them on the RHS of the rules, then you'll know that during execution you're never generating a term that could have been simplified by `anywhere` rules.
This statement is false because of symbolic narrowing, and that is the reason to apply them to the left-hand side of rules.
> If you simplify the LHS of the first `negate` rule, then you get an infinite loop on `negate(true /\ true) => negate(true /\ true) => negate(true /\ true)`. If you don't, then it simplifies in one step to `negate(B)`.
This was confusing to me; I think a clearer statement is that, after rewriting the left-hand side of the first rule, you get

```k
rule negate(B) => negate(B)
```

which (obviously) does not terminate for any input. But this is not only a problem with my theory-transformation plan; it's a problem with the equality encoding itself. Even now, while we encode `anywhere` rules as equations, we would be justified to apply those equations wherever we like, including on the left-hand side of rules. So, thank you @ehildenb for pointing out that the encoding we have right now is already invalid.
> I'm not sure of all the details, but my summary is that I think narrowing with `anywhere` rules will ruin completeness of narrowing, which is probably an unacceptable trade-off. Also, as I think about it, depending on how conditional equations are used, narrowing may also be incomplete.
@sskeirik I thought that what I suggested would not be considered narrowing per se, because we are doing case analysis and keeping both cases (that is, we are partitioning `\top`). That's what the backend does whenever it applies equations, so if that's not acceptable, then we have a very, very serious problem.
@ttuegel I thought I responded to your comment earlier. I've probably misremembered some details. If you do case analysis then it's always complete and it's not narrowing --- and my whole comment doesn't apply. I wish I had stopped to think a bit more before I wrote it.
> Specifically, the process of ordering the rules and inter-applying them to themselves sounds very similar to a part of the coherence completion process, which allows you to take non-free constructor definitions (non-free because there are equations over them), and simplify the theory to one only over the free variants of those constructors.
@ehildenb Thanks for the paper! I have read it, but have not fully digested it yet. It did give me a clue about what is wrong with the way I proposed sorting rules. I should have proposed this:
```
Rᵢ ≤ Rⱼ  if  Rⱼ could never apply to any term in Rᵢ
```
This would exclude the following rule from your definition,

```k
rule true /\ B => B [anywhere]
```

because this rule could apply to its own right-hand side for suitable `B`. The required ordering is not consistent: this rule is not even less than or equal to itself. I think this system would also be rejected by the method in the paper you sent, but as I said, I have not fully digested it yet, so I could be wrong. In any case, as I said in another comment, this system isn't compatible with encoding rules by matching logic `\equals`, so either we must find another encoding, or find a reason to reject the system.
I have an alternative proposal to handle "anywhere" symbols which rests on well-trodden ground, theoretically speaking.
First, recall that a symbol is an "anywhere" symbol if it appears at the top of one or more `anywhere` rules. For every "anywhere" symbol `c(...)`, we would generate two Kore symbols: a (free) constructor `c{}(...)` and a total function `c-anywhere{}(...)`. When the frontend translates K to Kore, it will translate `c(...)` as:

- `c-anywhere{}(...)` when it appears on the right-hand side of a rule, in the `ensures` clause, in the `requires` clause, or at the top of the left-hand side of an `anywhere` rule;
- `c{}(...)` when it appears on the left-hand side of a rule, except when it appears at the top of an `anywhere` rule.

Every `anywhere` rule will be translated to Kore as a function equation. The frontend will generate an `owise` rule,

```k
rule c-anywhere(...) => c(...) [owise]
```

which implicitly rewrites `c-anywhere{}` to `c{}` for any cases not otherwise specified by the user. I think this is perfectly sound theoretically because it relies only on functions and constructors, which are well-behaved.
@dwightguth Is it feasible to support this in the frontend? I can see some potential challenges. Will `kprint` be confused by having a single K symbol correspond to two Kore symbols? Is it going to be a problem to generate this implicit `owise` rule that only exists in Kore (without a corresponding rule in K)? Are there other risks that I don't see?
@ttuegel I'm glad that you liked the solution proposed at the start of https://github.com/kframework/kore/issues/2405#issuecomment-780444275 , I didn't get feedback on it so I was having doubts that it was actually a good thing.
> I didn't get feedback on it so I was having doubts that it was actually a good thing.
Sorry, I needed some time to digest it! :sweat_smile:
The backend does not support symbols with the `anywhere` attribute. These symbols are like constructors, except they do not have no-confusion axioms with certain other constructors, based on the user-defined `anywhere` rules. Because certain no-confusion axioms are missing, we cannot unify `anywhere` symbols as constructors, or indeed, at all. It is not our goal to support `anywhere` symbols, but here is what would be required:

- Omit `no-confusion` axioms between pairs of symbols related by `anywhere` rules, but include the others. Note that the relation between symbols is transitive, i.e. if `a` is related to `b`, and `b` is related to `c` by `anywhere` rules, then `a` is also related to `c`.
- Track (via `no-confusion` axioms) which constructor-like symbols are not subject to confusion, instead of requiring that all constructors be free constructors.
- Generate `no-confusion` axioms based on the user-defined `anywhere` rules, and modify the backend to apply these as needed.
- Apply `anywhere` rules during unification to symbols that may be confused.
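The transitive (and symmetric) relation between `anywhere`-related symbols described above amounts to computing connected components over the symbols involved; a hypothetical Python sketch:

```python
def confusion_classes(pairs):
    """Group symbols related, directly or transitively, by anywhere rules.
    `pairs` are (symbol, symbol) pairs drawn from hypothetical anywhere rules."""
    classes = []
    for x, y in pairs:
        # Merge every existing class that touches x or y into one class.
        touching = [c for c in classes if x in c or y in c]
        merged = {x, y}.union(*touching)
        classes = [c for c in classes if c not in touching] + [merged]
    return classes

# If a is related to b, and b to c, then a is also related to c:
[cls] = confusion_classes([("a", "b"), ("b", "c")])
assert cls == {"a", "b", "c"}
```

No-confusion axioms would then be omitted exactly for pairs of symbols falling in the same class.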