Check that LHS and RHS of simplification and function rules are consistent

jberthold commented 1 year ago

Function and simplification rules should be checked to ensure that the left-hand side and the right-hand side are either both predicates (evaluate to \top or \bottom or both function-like (function or constructor app.s, domain values).

Rules like this one rule #bufStrict(_, _) => #Bottom where a function (function-like) evaluates to a plain \bottom (predicate) should be a compiler error.

ehildenb commented 1 year ago

@virgil-serbanuta will making this restriction on rules in the semantics rules break anything?

We want to add the additional restriction that the requires and ensures clauses do not have any ML connectives.

virgil-serbanuta commented 1 year ago

Ok... I'm not sure I understand your question properly, but let me try:

First, I assume that by "break anything" you mean "break any existing projects" or something similar. I don't think it would break much besides the current elrond+wasm thing I'm building, and, even there, most of the broken code will be in the part that gets thrown away when the Elrond semantics will be ready for symbolic execution.

However, I think that the above may force people to work in suboptimal ways. Below I tried to describe why ML connectives may be useful in the contexts you mentioned, and, sometimes, provide some suggestions about how to replace ML connectives with something better.

I think that, in the past, rules like rule #bufStrict(_, _) => #Bottom were useful for making the Haskell backend remove some branches. My guess is that the right way to solve this issue is to create a #Ceil rule for #bufStrict and file a backend issue if that does not work (however, given how long it takes sometimes for them to fix an issue, having a workaround may be nice; should a project stop for a few days until the backend fixes the issue? for a week? a few months? should it be rewritten if the issue is never solved? Having more power makes the time it takes to fix bugs less of an issue - this also applies to most things mentioned below).

Speaking of which: Why aren't there #Ceil rules for all partial functions in domains.md?

It is useful to have ML-like connectives in the LHS, e.g. when writing a rule rule f(X) => g(X), where f is a partial function and g is total, but they are equal when f is defined, there are three options:

rule f(X) #as _Ceil => g(X) [simplification]  // Not allowed currently, I'm not sure why.
rule f(X) => g(X) requires (true #And #Ceil(f(X))) [simplification]  // You are currently considering not allowing this.
rule f(X) => g(X) requires fIsDefined(X) [simplification]  // I will discuss this below.

First, note that the first rule is also the shortest, and, unlike the third does not require the user to figure out if there is any fIsDefined function and, perhaps, to write a new one if there is no such function.

The last rule above is suboptimal currently, but might become a really good thing in the future. Here are the current issues:

If the user writes a condition by hand there, e.g. X >Int 0, then there is a higher risk of making mistakes.
If the user defines fIsDefined, but f is defined in a semantics that is not part of the current project, then either the user duplicates the condition used in #Ceil(f) (if #Ceil is defined), or the user has a higher risk of making mistakes.
If the user can change f's module, then the user can define fIsDefined the proper way, and can write a #Ceil rule that calls fIsDefined.

However, this could be better. It would be nice if, for each function f for which there is another (total) function, say, fIsDefined, #Ceil would be automatically defined. This could be implemented with an attribute, e.g. syntax Bool ::= fIsDefined(same-args-as-f) [function, total, ceil(f)].

To make a summary, the best solution might be to define fIsDefined and #Ceil as described in the paragraph above, but then still use the first rule, rule f(X) #as _Ceil => g(X) [simplification], although other people may think differently than me.

Huh, I should have written a design doc.

P.S. Users may want to write rules like f(X) => h(X) requires #Not (#Ceil(X))

It is sometimes useful to create simplification rules that use ML connectives in the RHS for splitting the configuration (rule lhs => (rhs1 #And P1) #Or (rhs2 #And P2)).

This can be useful for simplifying, say, symbolic map expressions made of _[_-<_] and _[_] (with or without orDefault), when the user would might want to split anyway the configuration in order to simplify, but splitting this with rewrite rules may be hard. This can usually be worked around, though I don't know of any other general solution for simplifying these (i.e. M[X -> Y][Z] can't be simplified in general, but, with splitting, it can be reduced to M[Z] #And... and Y #And ...). This is used in the new Elrond + wasm tests I'm doing, but, as mentioned, I can probably work around that.

ML connectives in requires and ensures: Besides the rule f(X) => g(X) requires (true #And #Ceil(f(X))) [simplification] example above, one may also want to write, say, rewrite rules like these:
```
rule lhs(X) => f(X) requires true #And #Ceil(f(X))
rule lhs(X) => g(X) requires true #And (#Not (#Ceil(f(X))))
```
Note that both of these are currently preferable to most other options (hand-written expansion of #Ceil or having to define one's own fIsDefined function that duplicates the #Ceil condition from another module).
Let me give you an two examples that could have been solved by restricting the RHS of functions. One is an issue that took years to fix (and, as far as I know, was fixed by Dwight, not by the backend team), the other one is an issue that is still not fixed after a few years. I think I filed issues about both of these when I was working on the Multisig contract, and I think I spent months only on debugging and working around these (it was a long time ago, but I'm guessing that "months" is the right unit of time).

First, Map unification did not work in a very common (at least for me) use-case for a long time. Unifying (M 1 |-> 2) with (N 3 |-> 4) produced something like (M 1 |-> 2) #Equals (N 3 |-> 4), and replaced all occurrences of one of these maps with the other one. I think that this is fixed currently.

Second, map unification does not use in_keys, z3 does not use in_keys, so the backend can't remove invalid branches. To see a simple example, consider this rule-claim pair:

  syntax A ::= a(Map) | b(Map)
  rule a(X 1 |-> 2) => b(X)
----
  claim a(M:Map) => b(?_M:Map) requires 1 in_keys(M)

This does not pass because the backend can't figure out that 1 in_keys(M) implies that unification(M, X 1 |-> 2) does not leave a remainder.

I currently don't know of any reasonable way of making cfg(M) #And 1 in_keys(M) unify with X 1 |-> 2. I know an unreasonable one, that made the Multisig project work. However, both of these are easy-ish to fix if, in a function's RHS, one can introduce new map variables (AFAIK, not currently possible, at least not in a way that makes this work, allowing #Exists over the entire equation would have been nice), #And and #Or.

ehildenb commented 1 year ago

The expression true #And { #Ceil(f(X)) }, does this really make sense though? Also, you can always represent this as a new function defined_f : Arg -> Bool, and use true andBool defined_f(X) instead, I think? This is similar to what the in_keys(...) predicate gives you on maps, so that you can check that a key is present before doing a lookup, for example.

Specifically, we want to amke the restrictions:

No ML connectives allowed in req/ens clauses.
No mixing of ML connectives with function-like terms in LHS/RHS. The LHS and RHS must both be function-like, or they must both be ML connectives at the top.

Do you have any examples in Elrond semantics where you cannot work around these restrictions? The idea is that without these restrictions, it's too easy to write non-sensical rules. Likely we will be placing more restrictions in the future too, or fine-tuning them a bit, but this seems like a reasonable starting point.

virgil-serbanuta commented 1 year ago

The expression true #And { #Ceil(f(X)) }, does this really make sense though? Also, you can always represent this as a new function defined_f : Arg -> Bool, and use true andBool defined_f(X) instead, I think? This is similar to what the in_keys(...) predicate gives you on maps, so that you can check that a key is present before doing a lookup, for example.

Yes, indeed, that's what I proposed above. I did even more than that and I said that, if a certain attribute is present on the defined_f symbol, a corresponding #Ceil simplification rule should be generated.

If you're interested, I have more thoughts on how this can be extended in order to fix the broken-ish state of partial symbols.

Do you have any examples in Elrond semantics where you cannot work around these restrictions? The idea is that without these restrictions, it's too easy to write non-sensical rules. Likely we will be placing more restrictions in the future too, or fine-tuning them a bit, but this seems like a reasonable starting point.

I don't have any such examples.

One of my concerns is that there will be less ways to work around Haskell backend limitations or bugs that may stay unresolved for years, as the ones I mentioned above.

Another concern is that it would be tedious to translate #Ceil(complex-term) to defined_stuff(...) andBool defined_other_stuff(...) andBool ....

runtimeverification / k

Check that LHS and RHS of simplification and function rules are consistent #3284