[K-Improvement] Better handling of partial symbols

Introduction

K functions can be total or partial. For partial functions, the semantics may contain rules for evaluating their #Ceil expressions. The Haskell backend should not send partial terms to Z3 except for special cases (though it seems to do so right now). This document will attempt to show how can users handle most partial functions in a way that works well with the current symbolic execution infrastructure, and how K can kelp with that.

Example

Let's say that a user defines a partial function f and uses it in the requires clause of some rewrite rules:

  syntax Int ::= f(Int, Int)  [function]
  rule f(I, _) => 0 requires I ==Int 1
  rule f(I, _) => I +Int 1 requires I =/=Int 1 andBool I =/=Int 0
  rule #Ceil(f(I, _)) => I =/=Int 0  [simplification]

  syntax KItem ::= "a"
  rule I:Int => a requires f(I, I) <Int 0
  rule I:Int => a requires f(I, I) >=Int 0

Since f is not defined in 0, these rules may not fully rewrite a configuration that has an int at the top of the k cell. As an example, the Haskell backend will not prove the following claim:

claim _Value:Int => a

However, the backend will not proove the claim even if the user adds this rule:

  rule 0:Int => a

The user may attempt to constrain the rules to cases where f is defined, but that still does not work:

  rule I:Int => a requires f(I, _) <Int 0 andBool I =/=Int 0
  rule I:Int => a requires f(I, _) >=Int 0 andBool I =/=Int 0
  rule 0:Int => a

Option 1 - Send potentially undefined terms to Z3 when defined

In practice, the side condition for most configurations implies that all of the configuration's subterms are defined and that most of the subterms showing up in the side condition are defined. This would seem to suggest that it should be safe to send those undefined terms to z3, especially if the side condition that implies their definedness is included. However, this is probably not always the case, e.g. (#Not {true #Equals f <Int 0}) #And #Not {true #Equals f >=Int 0}, which looks like a remainder predicate from a split, is unsatisfiable if f is total and is satisfiable otherwise, while Z3 would say that both are unsatisfiable. Also, as shown in issue 36031, special care should be taken when translating partial functions to z3 builtins (or z3 functions in general).

Option 2 - Rewrite partial functions to total ones

We can also define a total function for each partial function in the semantics (or, at least, for each partial function about which we care). Continuing the example above, the user could define

syntax Int ::= fTotal(Int, Int) [function, total, no-evaluators]

We can imagine fTotal to be the same as f in all places where f is defined, and being an unspecified value (or 0, or whatever) for all points where f is not defined. Then, in most real-world cases, we can replace f with fTotal like this:

rule f(X, Y) => fTotal(X, Y) requires true #And #Ceil(f(X, Y))  [symbolic(X), simplification]
rule f(X, Y) => fTotal(X, Y) requires true #And #Ceil(f(X, Y))  [symbolic(Y), simplification]

And, in order to evaluate fTotal whenever that is possible, we can do this:

rule fTotal(X, Y) => f(X, Y) requires true #And #Ceil(f(X, Y))  [concrete, simplification]

However, the requires clauses above are kind of awkward, so the user may also define

syntax Bool ::= definedF(Int, Int)  [function, total]
rule definedF(I, _) => I =/=Int 0

and then we can finally write

rule f(X, Y) => fTotal(X, Y) requires definedF(X, Y)  [symbolic(X), simplification]
rule f(X, Y) => fTotal(X, Y) requires definedF(X, Y)  [symbolic(Y), simplification]
rule fTotal(X, Y) => f(X, Y) requires definedF(X, Y)  [concrete, simplification]
rule #Ceil(f(A, B)) => {true #Equals definedF(A, B)}  [simplification]
// The rule above is incomplete. We could also write the following if
// the Haskell Backend could handle it:
// rule #Ceil(definedF(@A, @B))
//      => #Ceil(true #And definedF(@A, @B))}
//         #And #Ceil(@A)
//         #And #Ceil(@B)
//   [simplification]

Problematic rule from 13/7 meeting:

#ceil(f(@A, @B)) = {definedF(@A, @B) #Equals true} #And #Ceil(@A) #And #Ceil(@B)

Fixed version:

#ceil(f(@A, @B)) = #Ceil(definedF(@A, @B) #and true) #And #Ceil(@A) #And #Ceil(@B)

Note that the fTotal and definedF symbols could be defined automatically by kompile. Also, the three translation rules and the #Ceil rule can be defined automatically. The only thing left for the user to do would be to specify what definedF does, and the user could skip doing that if it makes sense (though kompile would need to add the no-evaluators attribute to the symbol) To summarize the proposal: The following declarations:

  syntax Int ::= f(Int, Int)  [function]
  rule f(I, _) => 0 requires I ==Int 1
  rule f(I, _) => I +Int 1 requires I =/=Int 1 andBool I =/=Int 0

  syntax Bool ::= definedF(Int, Int)  [function, total]
  rule definedF(I, _) => I =/=Int 0
  rule #Ceil(f(I, J)) => definedF(I, J)  [simplification]

  syntax Int ::= fTotal(Int, Int) [function, total, no-evaluators]
  rule f(X, Y) => fTotal(X, Y) requires definedF(X, Y)  [symbolic(X), simplification]
  rule f(X, Y) => fTotal(X, Y) requires definedF(X, Y)  [symbolic(Y), simplification]
  rule fTotal(X, Y) => f(X, Y) requires definedF(X, Y)  [concrete, simplification]

  syntax KItem ::= "a"
  rule I:Int => a requires f(I, _) <Int 0 andBool I =/=Int 0
  rule I:Int => a requires f(I, _) >=Int 0 andBool I =/=Int 0
  rule 0:Int => a

will make this claim pass:

  claim _:Int => a

The method above can be used by any K user who is willing to write total representations of partial functions as mentioned above. Probably users could also share modules that create total representations of partial functions and operators in domains.md.

K change suggestions

The K team could provide a better way of doing this: For each partial function (including operators), kompile should

define a total version of the operator a. Whenver the total version is too complicated to produce automatically ( e.g. what would be the name for the total version of Map[KItem]? Can the name be compited automatically always? Can the smtlib attribute be produced automatically?) users could define their own version of the function/operator, and link it to the main symbol using an attribute, e.g. one of the following:

    syntax Int ::= Int "/Int" Int
        [function, hook(INT.tdiv), klabel(_/Int_), symbol, total_form(_/IntTotal_)]
    syntax Int ::= Int "/IntTotal" Int
        [function, total, klabel(_/IntTotal_), symbol, smt-hook(div), no-evaluators]

    syntax Int ::= Int "/Int" Int
        [function, hook(INT.tdiv), klabel(_/Int_), symbol]
    syntax Int ::= Int "/IntTotal" Int
        [function, total, total_form_of(_/Int_), klabel(_/IntTotal_), symbol, smt-hook(div), no-evaluators]

b. Users could also specify a symbol to be used for the total version:

    syntax Int ::= Int "/Int" Int
        [function, hook(INT.tdiv), klabel(_/Int_), symbol, total_symbol(total/Int)]

and kompile could define the following automatically:

    syntax Int ::= "total/Int" "(" Int "," Int ")"
        [function, total, klabel(total/Int), symbol, no-evaluators]

Kompile could also generate a defined... symbol in a way similar to the total... symbol above, e.g.
```
syntax Bool ::= "defined/Int" "(" A "," B ")"  [function, total, defined_for(_/Int_)]
```

Kompile could generate the translation rules:

rule A /Int B => A /IntTotal B requires defined/Int(A, B)  [symbolic(A), simplification]
rule A /Int B => A /IntTotal B requires defined/Int(A, B)  [symbolic(B), simplification]
rule A /IntTotal B => A /Int B requires defined/Int(A, B)  [concrete, simplification]

Kompile could generate the #Ceil rule in one of its possible forms, e.g.:

rule #Ceil(A /Int B) => {true #Equals defined/Int(A, B)}  [simplification]

Whenever they need it, users would only write equation rules for the defined... symbol

Maude has the notion of "Kind" for each connected component of sorts, where any undefined function becomes an element in the term algebra the "Kind" (which is basically a supersort of that connected component of sorts, with all the additional terms as elements that are not defined for lesser sorts). Maybe this is somehow similar, or could reuse some of that idea too?

See section 3.5 of the Maude manual: https://maude.cs.illinois.edu/w/images/9/9d/Maude-3.3.1-manual.pdf

Disclaimer: I don't know much about Maude, so I may have misunderstood some things.

This is interesting.

Yes, it seems that there are similarities between kinds and the ideas above.

About reusing that idea:

First "connected components of sorts", whatever that means, might not make sense in K. I guess that most ideas that are based directly on the Maude kinds involve defining for each sort A, a sort/kind [A] that looks something like syntax [A] ::= A | "bottomA". FWIW, I think that bottomA is needed in the definition, one can't just use syntax [A] ::= A.

Second, right now a partial function f from A to A can be used in places where an A value is needed. This has its good and its bad parts, but that's not what's important here. The important part is that a function whose value is an [A] can't be used as an A value. So, if partial functions are implemented as total functions on "kinds" or if we want to substitute them for total functions on kinds as described above, then each symbol that currently takes an A as an argument would need to actually take [A] instead.

Third, if a symbol (of sort S) takes an [A] as an argument, if we want to have a semantics equivalent to what we currently have, then the symbol must take bottomA to bottomS. However, this would create problems for constructors: A constructor defined as syntax S ::= c(A, B) would need to be redefined as syntax [S] ::= c([A], [B]) and we would need to have bottomS ==S c(bottomA, _:[B]) ==S c(_:[A], bottomB), which would mean that constructor axioms would need to be modified to take bottom* into account. This means that K constructors would no longer be the same as ML constructors.

Fourth, since a symbol's behaviour involving bottom* arguments is fixed, it would probably be a good idea to allow users to pass bottom* to function symbols only in simplification rules, and not in function definition rules.

That's all I got so far when I tried to figure out how kinds would work in K. I'm not sure it's better that what I proposed above, but it may improve with more design work.

w.r.t. to the proposed solution, could we not simply do the following?

1) for any function not marked total, the frontend would require the user to supply a ceil/defined predicate. the syntax for this is not really important. 2) then, all rules for the given partial function of the form

   rule f(...) => rhs_1
   ...
   rule f(...) => rhs_n

would be transformed by the frontend into

   rule f(...) => rhs_1 ensures defined(f(...))
   ...
   rule f(...) => rhs_n ensures defined(f(...))

given that in the booster, we are only really interested in executing while the configuration remains defined, and keeping it so after the rewrite, this should be sufficient?

@goodlyrottenapple I don't know much about the booster (I keep hearing about it, but I'm still not sure exactly what it does), but the rules you mentioned seem to be rules that evaluate f.

One of the issues I was trying to solve above was that it may not be possible to evaluate f while doing symbolic execution, while I may want to be able to, say, send expressions containing f to z3 if the side condition implies that the term is defined.

Also note that f may only occur in the side condition, so it may be undefined even if the configuration is defined.

So the major restriction in the booster right now is that it does not deal with definedness, namely we assume that the configuration we were given is defined (we do this check separately via the old backend I believe) and at every step of rewriting/function application/simplification we maintain this invariant. Hence, we cannot apply any partial functions atm. My thinking on this issue is, how can we introduce applications of partial functions whilst maintaining the above invariant. If we are only interested in executions where everything stays defined, is my reasoning correct in assuming all we need to do when applying some partial function f is to add the constraint of definedness for that function call f to the list of predicates of the current configuration?

Sorry, I forgot to answer this. I hope that my answer below makes sense.

My short answer is that, if f has no ML connectives above it (e.g. not something like #Not(f(...))), then a rewrite/simplification/function definition rule LHS => RHS requires R ensures E can be thought of as one or two rules (and it can probably be implemented this way):

LHS => RHS requires definedF(...) andBool R ensures E
LHS => #Bottom requires notBool definedF(...) andBool R (only if f occurs in RHS or E, but not in R)

If f has ML connectives above, then it's complicated, and I will not attempt to figure out what happens in this reply.

My long answer (assumes that f has no ML connectives above it and that all partial functions g have definedG total functions with full definitions):

Rewrite rules

If f shows up when applying a rewrite rule, I think it's usually easier to consider a configuration split of the initial configuration based on the isDefined(f) predicate. On the branch where isDefined(f) is true everything should work as expected. On the branch where isDefined(f) is false, then the following should happen, based on where f occurs in the applied rule:

If f shows up in the requires clause, the rule does not apply.

Otherwise, if f shows up in the term part of the RHS, then the entire configuration goes to #Bottom, ending the proof/exploration.

Otherwise, if f shows up in the ensures clause, the configuration goes to #Bottom

Function definition rules

When f occurs in a function definition rule, then the case analysis is similar to the rewrite rules one, except that, perhaps, we should handle branches with #Bottom results somewhat differently. The main issue is that the user could, in principle, write something like:

rule myF(X) => f(X)
rule myF(X) => g(X)

If f and g cannot be defined at the same time, e.g. f is defined when X <= 0 and g is defined when X > 0, then this is a valid definition for myF. FWIW, I don't know if this works with the current Haskell backend or not.

The rules above would be transformed to:

rule myF(X) => f(X) requires definedF(X)
rule myF(X) => #Bottom requires notBool definedF(X)
rule myG(X) => g(X) requires definedG(X)
rule myG(X) => #Bottom requires notBool definedG(X)

Anyway, if we assume that myF was defined before applying these rules, then we can ignore the rules that produce #Bottom and we can just use:

rule myF(X) => f(X) requires definedF(X)
rule myG(X) => g(X) requires definedG(X)

Function simplification rules

If f occurs in the LHS term, then f should be defined can be ignored. If f occurs in the requires clause R, we can just add andBool definedF(...) to R.

If f occurs in the RHS term or the ensures clause E, then we can assume f is defined, since simplification rules should preserve definedness.

So, for simplification rules, the only interesting case is when f occurs in the requires clause R.

Summary from 13/7 meeting is:

The booster (@goodlyrottenapple , @jberthold) need to figure out how the booster wants to handle partial functions.
Frontend support for non-#Ceil definedness conditions over Bool is needed (rather than ML-side) - most important to allow future prototyping
Where do we want to implement any required transformations? Frontend or in the booster?

Next steps:

SMT support in the booster is incoming and will require some version of this to be implemented (next ~months timescale)

FWIW, you can always use:

f(X, Y) === fTotal(X, Y) #And {true Equals definedF(X, Y)}

so, whenever you can integrate the RHS above nicely in a rewrite/equational rule, you can replace f by fTotal.

Putting this in the backlog until the ongoing work to add SMT support to the booster is done

runtimeverification / k