Proposal for representing sort injections

ana-pantilie commented 1 year ago

@goodlyrottenapple says:

we could perhaps get rid of them at the Term level and instead push them into the Sort type? I think this would make code like the matcher and unifier easier to read,

This is an intriguing idea! :+1: I also believe it will be easier to write the unifier/matcher if we get it right (see below) Might be worth its own issue, as we need to think this through a bit more. Writing down some ideas here for now.

we would only be matching on actual structure, such as constructors or values and we could handle the matching/unification of the types of both terms in a separate function. In theory, it could be as simple as having [Sort] in Terms, where e.g.

DomainValue [s1,..sn] x ~ inj{s2, s1}(inj{s3,s2}(...inj{sn, sn-1}(\dv{sn}(x)) ...))

IIUC that would mean generally transforming terms with such inj chains into a single flat constructor with an ordered (wrt. subsorts), non-empty list of sorts (called sort-chain below), at the time when we convert from text/json to the internal representation (and vice versa). inj{s2,s1}(inj{s3,s2}(<constructor>(x)) becomes <constructor'>{s1 :| [s2, s3]}(x) where :| is a non-empty list constructor, and s3 subsort s2 subsort s1
The ordering property would not really matter to us, we assume the injections are correctly-arranged.
If I understand correctly, a rule match with such sort representation would match a subject term's sort-chain's prefixes to the sort-chain of the pattern term on the LHS. The match would quite probably need to reflect the actual prefix that was matched, as we want to replace that part by whatever the RHS makes from it (if it is a variable), _leaving the suffix intact for subsequent matches).
I am unsure how this would interact with "shortcut injections" where another component removes the middle sort from a chain like the above (turning inj{s2,s1}(inj{s3,s2}(<constructor>(x)) into inj{s3,s1}(<constructor>(x)) - which is apparently something that the LLVM backend does.

Originally posted by @jberthold in https://github.com/runtimeverification/hs-backend-booster/issues/4#issuecomment-1301485695

ana-pantilie commented 1 year ago

@ehildenb and @dwightguth , thoughts?

ehildenb commented 1 year ago

@ana-pantilie I really recommend you do not try to deal with sort injections yet. For 2 reasons:

Only 1 rule in 10 on KEVM actually has to do anything with subsorts, everything else is directy. So you can fallback to the old rewriter for those steps.
If you treat the sort injections in the same way we treat AC matching (that is, return "don't know"), we'll be able to amke immediate progress and then see the types of problems that come up.

So I would propose this approach:

If sort injection between the two patterns is exactly the same (which I suspect will be the case many of times, because the same algorithm is being used by the frontend to insert sort injections in both the definition and the term), then treat it like a free constructor and continue unification below it.
If they are different, just return the equality { T1 #Equals T2 } (where one or both terms have a sort injection at the top).

This allows our new unification algorithm to be "safe", while making a minimal attempt to deal with sort injections. If we find that this is not enough, then we will be able to collect data about how often we are falling back to the old backend due to sort injection issues.

ana-pantilie commented 1 year ago

then treat it like a free constructor

I guess that means we should have a separate node in the AST for it, at least for now, right @goodlyrottenapple @jberthold ?

goodlyrottenapple commented 1 year ago

maybe not even necessary for now if we treat them as free constructors?

jberthold commented 1 year ago

To re-iterate the in-person discussion from yesterday: What we should think about at some point is what it means to "short-cut" the injections. E.g.., if you have Int < IOInt < KItem and a term inj{IOInt, KItem}(inj{Int, IOInt}(42)), IIUC the term construction would be cutting out the inj(Int, IOInt) and we have inj{Int, KItem}(42). Now if you have rules that match on things containing IOInt, for the rule to apply it would have to accept the Int as an IOInt How would the subsort relation Int < IOInt be recovered?

ehildenb commented 1 year ago

I'm pretty sure injections can be treated as total function symbols basically. They have the properties: single-valued, injective, and total. In particular, total functions satisfy weaker properties than that: single-valued and total. Understanding that they satisfy the same axioms as total functions, lets us know immediately that:

They don't affect preserves-definedness.
They can be safely treated as total-functions by unification.
The user can supply simplification rules that deal with them, instead of the backend handling them.

User supplied simplification rules may not be performant enough long-term, but we don't know yet. For a subsort chain A < B < C, we could have the frontend (or user) supply the needed simplification rules for expanding/contracting the subsort chains.

This is why I'd lean towards not having special representation or treatment of them initially, similar to how we are not treating Maps specially.

I think the biggest problem we'll run into is that the unifier will not be returning the correct results, or the best results, in some cases. I think this means we need to adjust our rewriter to not only check that the returned unification problem is a matching substitution, but also needs to return a well sorted substitution. That is, for each unificand { Vi #Equals Ti }, we should have that Sort(Ti) <= Sort(Vi), where we maintain the subsort relationship table elsewhere (and compute it quickly once at load time). We should probably also update each unificand with the necessary sort injections that we discover. This seems like a very low-overhead way to work with injections in our current design.

That modification should allow us to handle the most common case, which is where you have a single variable in a rule which is a broader sort than the subject it's matching (so it catches lots of sub-cases), and the actual term is a narrower sort.

ehildenb commented 1 year ago

The other common case, of course, is the complete opposite. You have a variable of a more specific sort than the term it's matching, but it's still a matching substitution. The well sorted check will correctly filter out that possibility.

ehildenb commented 1 year ago

The more I think about it, the more I think we just should not send terms with injections to the new backend, and it should work in the order-sorted setting instead of the many-sorted setting.

Maude definitions must satisfy the "preregular" property (see Section 3.8 of the Maude Manual), which is basically the most simple sanity check on the signature supplied by the user which ensures that the order-sorted-ness behaves well. Preregularity is only really a problem if you have operator overloading.

So the new rewriter can check for overloading first, and reject any definitions that have overloading. Overloading is not a commonly used feature, and indeed it's tricky to use and get correct in K definitions. It may be that some K definitions get nicer if we have better overloading heirarchy, but it has not been needed yet. KWasm is one example of a semantics that uses overloading, I think it can be worked around there with a little rule duplication.

Order-sorted is not much more complex than many-sorted, and as long as you validate that all terms that come in (and rules/definitions) are well sorted, you can largely ignore the order sorted component. When doing unification, you need to check that the substitutions are well sorted, and fast unification algorithms that handle order-sorted definitions exist (eg. Maude, which can be used as a library over FFI to do unification problems for us if we get to the point where we decide that AC unification is needed for the fast rewriter).

The 2 things about K definitions is that:

Many of them don't really use very much order sortedness, and even less overloading.
The KLabels (and Kore Symbols) we generate for K definitions contain the exact sort of their inputs in the names of the labels themselves. So if we do nothing to handle the order-sorted component, and we have no overloading (which we already reject above), then direct constructor unification should still work correctly for order-sorted terms and we shouldn't have to do any extra sort-checks except on the final unificands we get. It may be the case that this sort check throws out some more rules (for KEVM it should throw out 1 rule per opecode cycle, which is about 10 execution steps).

So basically what I'm proposing is that when we feed Kore into this backend, we don't insert sort injection nodes, instead just interpreting the Kore term as an order-sorted term. I think because we're only trying to handle the 90% case, we'll find that this does not limit us severly and makes us not have to worry about injections at all. It is just a few lines of code to change the pyk library to emit this order-sorted Kore instead of many-sorted Kore (and to have it able to emit both, so it can still speak to the old backend), and this backend can strip all the injections from the incoming definition when it loads it.

Adding support for overloading later is also not difficult I don't think, but it does incur more sort and subsort checks during unification I believe.

runtimeverification / haskell-backend

Proposal for representing sort injections #3798