Replace rule list with "DSL"

ekpyron commented 5 years ago

TBD

We can probably just use a triplet of Yul expressions, resp. expression patterns:

precondition: evaluates to non-zero, if the rule can be applied
match: pattern expression for matching the rule
replacement: pattern expression to replace the matched code snippet

The match expression can use symbolic pattern variables for matching variables and constants and (at a later stage when extending this from expressions to statements) arbitrary code blocks. Precondition and replacement expressions can use any pattern variables used in the match expression.

christianparpart commented 5 years ago

@ekpyron can you give some example? (or in other words: when can we talk face-to-face about it?) I'm interested in taking this one, but I'd need some more (precise) input of what you have in mind. :)

ekpyron commented 5 years ago

This issue was mainly meant as a reminder to discuss this further - the idea came up with @chriseth at devcon, but I think this might need some more careful and future-proof planning before we actually start with it.

chriseth commented 4 years ago

Examples for potential syntax:

//-comments are allowed everywhere.
rules are separated by empty (non-comment) lines
first line is match expression, second line is replacement
we need a way to specify conditions on the dialect (EVM, wasm, hasSelfbalance, ...)
complictated yul blocks are probably out of scope for now.
we need a way to denote computations on constants. For simplicity of parsing, I would propose functional notation (@+)
we should make it more obivous whether a placeholder can match anything or just literals. My proposal would be to prefix placeholders with # if they should only match literals

-EVM hasSelfbalance
// need to specify dialect, with restrictions on some rules

balance(address())
selfbalance()

//comment

add(#A, #B)
@+(A, B)

add(X, 0)
X

// rules that remove or reorder placeholders
// only match if the placeholders are movable
// This is a feature of the engine
mul(X, 0)
0

if 1 { T }
{ T }

if 0 { T }
{}

// what are the conditions on T here?
T revert(X, Y)
revert(X, Y)

// can we have "same match" on full statements?
switch X case 0 { T1... T } default { T2... T}
switch X case 0 { T1... } default { T2... } T

axic commented 3 years ago

There's also #5542 as a relevant issue.

axic commented 3 years ago

first line is match expression, second line is replacement

If we have such a system, I think < and > prefixes would be nice (diff-style), e.g.:

< balance(address())
> selfbalance()

But @ekpyron's proposal in #5542 of defining equalities may be more intuitive:

balance(address()) == selfbalance()

axic commented 3 years ago

In this the core syntax is equality of expressions (from @ekpyron). Where there is no loss of information (replacing instructions with faster instructions, such as shifts), it could reversible too.
A condition can be introduced via iff <expr>. Expressions here must be "constexpr". We borrow the idea from Rust to mark these constexpr versions (macros) with exclamation mark, e.g. sub!, lt!, etc.
Any number of whitespace is allowed between both the rewrite rule and the condition.
Expressions are EVM opcodes (or yul dialect builtins).
Conditions are a different set of expressions/builtins. Special conditionals are introduced for wordsize, bit testing, etc.
The terms A, B, C mean literal on the stack, while X, Y, Z mean non-literals on the stack.
Properties/attributes are introduced via [@<key> <value>], eg. [@evm hasSelfbalance]. It is only valid for the following rewrite rule.
// marks comments.

I played with the syntax a bit and tried to translate a set of random steps from RuleList.h.

//
// Simple rules
//

[@evm hasSelfbalance]
balance(address()) == selfbalance()

exp(0, X) == iszero(X)

exp(1, X) == 1

[@evm hasBitwiseShifting]
exp(2, X) == shl(X, 1)

exp(-1, X) == sub(iszero(and(X, 1)), and(X, 1))

// move constants across subtractions
// X - A -> X + (-A)
sub(X, A) == add(X, sub!(0, A))

//
// Rules with conditions
//

//        rules.push_back({
//                Builtins::BYTE(A, Builtins::SHR(B, X)),
//                [=]() -> Pattern { return Word(0); },
//                [=] { return A.d() < B.d() / 8; }
//        });
byte(A, shr(B, X)) == 0 iff lt!(A, div!(B, 8))

//        // Replace MOD X, <power-of-two> with AND X, <power-of-two> - 1
//        for (size_t i = 0; i < Pattern::WordSize; ++i)
//        {
//                Word value = Word(1) << i;
//                rules.push_back({
//                        Builtins::MOD(X, value),
//                        [=]() -> Pattern { return Builtins::AND(X, value - 1); }
//                });
//        }
//
// In these cases it may make more sense to just expand it manually.
mod(X, 1) == and(X, 0)
mod(X, 2) == and(X, 1)
mod(X, 4) == and(X, 3)
// But could also consider adding a pseudo-variable.
mod(X, shl!(1, K)) == and(X, sub!(shl!(1, K), 1))) iff and!(gte!(K, 0), lte!(K, wordsize!()))

// Also extra whitespaces can be added for readability.
mod(X, shl!(1, K)) == and(X, sub!(shl!(1, K), 1)))
   iff
and!(
  gte!(K, 0),
  lte!(K, wordsize!())
)

//                {Builtins::SHL(A, B), [=]{
//                        if (A.d() >= Pattern::WordSize)
//                                return Word(0);
//                        return shlWorkaround(B.d(), unsigned(A.d()));
//                }},
shl(A, B) == 0 iff gte!(A, wordsize!())
shl(A, B) == shl!(A, B)

// {Builtins::LT(A, B), [=]() -> Word { return A.d() < B.d() ? 1 : 0; }},
lt(A, B) == 0 iff gte!(A, B)
lt(A, B) == 1 iff lt!(A, B)

axic commented 3 years ago

If we want to not have magic variable names, then as mentioned we need a way to mark literals. I think the # prefix on the match is an okay way do it, but probably makes sense to keep that prefix on both sides.

axic commented 3 years ago

Here are some more, line-by-line translations:

 add(#A, #B) == add!(A, B)
 mul(#A, #B) == mul!(A, B)
 byte(#A, #B) == 0 iff gte!(A, div!(wordsize!(), 8))
 byte(#A, #B) == shr!(B, and!(mul!(8, sub!(sub!(div!(wordsize!(), 8), 1), A)), 0xff)) iff lt!(A, div!(wordsize!(), 8))
 shl(#A, #B) == 0 iff gte!(A, wordsize!())
 shl(#A, #B) == shl!(A, B)
 shr(#A, #B) == 0 iff gte!(A, wordsize!())
 shr(#A, #B) == shr!(A, B)
 sub(not!(0), X) == not!(0)
 mul(X, not!(0)) == sub(0, X)
 mul(not!(0), X) == sub(0, X)
 and(X, not!(0)) == X
 and(not!(0), X) == X
 eq(X, 0) == iszero(X)
 eq(0, X) == iszero(X)
 gt(X, 0) == iszero(iszero(X))
 lt(0, X) == iszero(iszero(X))
 gt(X, not!(0)) == 0
 lt(not!(0), X) == 0
 and(byte(X, Y), 0xff) == byte(X, Y)
 byte(sub!(div!(wordsize!(), 8), 1), X) == and(X, 0xff)
 not(not(X) == X
 xor(X, xor(X, Y)) == Y
 or(not(X), X) == not!(0)
 and(and(X, Y), Y) == and(X, Y)
 mod(X, 1) == and(X, 0)
 mod(X, 2) == and(X, 1)
 mod(X, 4) == and(X, 3)
 mod(X, shl!(1, K)) == and(X, sub!(shl!(1, K), 1))) iff and!(gte!(K, 0), lte!(K, wordsize!()))
 byte(#A, shr(#B, X)) == 0 iff lt!(A, div!(B, 8))
 and(address(), sub!(shl!(1, 160), 1)) == address()
 and(caller(), sub!(shl!(1, 160), 1)) == caller()
 and(origin(), sub!(shl!(1, 160), 1)) == origin()
 and(coinbase(), sub!(shl!(1, 160), 1)) == coinbase()
 iszero(iszero(eq(X, Y))) == eq(X, Y)
 iszero(iszero(lt(X, Y))) == lt(X, Y)
 iszero(iszero(slt(X, Y))) == slt(X, Y)
 iszero(iszero(gt(X, Y))) == gt(X, Y)
 iszero(iszero(sgt(X, Y))) == sgt(X, Y)
 iszero(iszero(iszero(X))) == iszero(X)
 iszero(xor(X, Y)) == eq(X, Y)
 byte(#A, shr(#B, X)) == 0 iff lt!(A, div!(B, 8))
 sub(X, #A) == add(X, sub!(0, A))
 sub(add(X, #A), Y) == add(sub(X, Y), A)
 [@evm hasSelfbalance]
 balance(address()) == selfbalance()
 exp(0, X) == iszero(X)
 exp(1, X) == 1
 [@evm hasBitwiseShifting]
 exp(2, X) == shl(X, 1)
 exp(-1, X) == sub(iszero(and(X, 1)), and(X, 1))

chriseth commented 3 years ago

This looks very good! Some remarks:

I'm not sure about the == if the two sides are not really equal (you can use lt!(#A, #B) on the RHS but not on the LHS).
iff should be if
maybe we should replace wordsize by bitsPerWord and bytesPerWord
maybe use !lt instead of lt!?

Finally: Why not have

 exp(2, X) == shl(X, 1) if  evm!(hasBitwiseShifting)

instead of

 [@evm hasBitwiseShifting]
 exp(2, X) == shl(X, 1)

axic commented 3 years ago

I'm not sure about the == if the two sides are not really equal (you can use lt!(#A, #B) on the RHS but not on the LHS).

You can use the evaluated terms on either side, but perhaps it is better to go with the direction indicator -> and not allowing implicit reversibility.

maybe we should replace wordsize by bitsPerWord and bytesPerWord

We could also decide to make the rulelist dialect specific and then this is not needed.

Why not have evm!(hasBitwiseShifting)

That also sounds like an option, but would make it look like the datasize builtin and use a literal and not identifier: evm!("hasBitwiseShifting")

maybe use !lt instead of lt!?

Not sure, i personally like the ! suffix better.

ekpyron commented 3 years ago

I just saw this discussion and would bring up the following:

I don't really see any value in rules like add(#A, #B) == add!(A, B). I would rather split evaluation away from simplification. Rather than evaluation being simplification rules, I would just mark functions like add as pure in the dialects and add a C++ evaluation function to them, s.t. such pure functions can generally be evaluated whenever all arguments have constant values. What is the benefit of introducing new syntax like add! for which you then just again have to define the semantics in C++ anyways? To me that seems like just introducing a whole lot of rather vacuous rules.
I don't think there needs to be a distinction between #A and A nor between gt and gt!. If you remove every # and every ! from the rule list, then every rule will still be valid semantically and I argue below that they are also just as easy to apply without the additional annotations.
If there are constants like not(0) on the left hand side I would just do "pure function evaluation" when loading the rules. So when the compiler loads the rules, it recognizes that not is pure and 0 is constant and evaluates not(0) before it internally stores the rule - this can be the very same pure function evaluator that I would run on the code instead of having add(#A, #B) == add!(A,B)-style rules, so it's near zero additional effort.
In general a rule like add(X, 0) == X is just shorthand for add(X, Y) == X if eq(Y, 0), i.e. the general format of any rule just has (potentially nested) function applications with only variables as arguments on the LHS and any expression on the RHS with only the restriction that any free variable of the RHS has to be a free variable of the LHS (this same restriction holding for the condition expressions).

So the current simplifications can be replaced by the following steps:

First of all pure functions that have only arguments with known values are evaluated using the evaluation function given in the dialect.
Rule are preprocessed before application (only has to be done once per dialect), first evaluating constants (considering wordsize() or whatever it'll be called as additional pure builtin function it can evaluate), then replacing each constant on the LHS by new variables while either adding a condition eq(newvar, <constant>) or replacing an existing condition by and(eq(newvar, <constant>), <oldcondition>).
Now on actual code whenever the left-hand-side pattern is encountered, it can be directly replaced by the right-hand-side pattern, if athe pure function evaluator evaluates the condition expression to non-zero. The advantage here is that in the future we can add more involved reasoning in this evaluation process. A full "reasoning based simplifier" like approach would be way too expensive, but simple inequality reasoning should be good enough, i.e. compare the currently proposed shl(#A, #B) == 0 iff gte!(A, wordsize!()) with shl(X, Y) == 0 if gte(X, wordsize()) If X is non-constant, but I know that gte(X, 1000), while wordsize() evaluates to 256, I can still apply the rule and correctly so. And for this to work only very simple range tracking and inequality-reasoning in the evaluator would suffice.

But yeah, just in case you want to consider this :-). It's not really an issue to first translate the rules into the verbose format, because we can still change it by simply removing any ! and # at any later point. I'd suggest to group "evaluation rules" and proper simplification rules together, though. Also my suggestion is tailored to the new optimizer and only the new optimizer. If we want to use the DSL for the old optimizer as well, we'll probably need these annotations, because we probably wouldn't want to rewrite any of that this much (for the new optimizer on the other hand I would argue that it would run more smoothly and be significantly simpler with the more general rules and split out "pure builtin function evaluation").

chriseth commented 3 years ago

Your proposal sounds very good to me, @ekpyron!

I'm wondering how the "redundant masking" rule would look like in his framework.

and(x, y) == x if eq(and(x, y), x)

This looks trivial, but we have to given the reasoning engine something to work with.

ekpyron commented 3 years ago

That's tricky... and(x, y) == x if eq(and(x, y), x) would "just work" in the framework I described without any special care whenever x and y are constants, resp. have known values, but then it is trivial and the rule wouldn't be needed anyways, because and(x,y) could just directly be evaluated.

For non-constants, we could try to optimize the reasoning engine for special cases like y is a constant power-of-two-minus-one and x has an upper bound - that'd be possible. Actually one could even formulate a rule specifically for that case that is easy to evaluate without special support in the reasoning engine - and(iszero(iszero(y)), iszero(and(y, sub(y, 1)))) should be "y is a power of two", so and(x, y) == x if and(and(iszero(iszero(add(y,1))), iszero(and(add(y, 1),y))), iszero(gt(x,y))) is something that should work in this special case and should be evaluatable for constant y using simple inequality reasoning about x and y only... but I wouldn't trust that I got that right without SMT-veryfing that rule :-) - and then the question is, where does it end and the more cases we want to cover the more this task would become "implement an SMT solver in the rule list or its reasoning engine", which is way beyond the scope of the original idea of this issue...

But this shows how complex reasoning engines can help applying rules in this framework - if we were to simultanously SMT-encode the Yul AST while applying the rules, then we could query an SMT solver to show that iszero(<condition>) is UNSAT (probably with a very small timeout, since otherwise it'd slow things down too much) - and if it is, apply the rule. Actually this could even replace "pure constant function evaluation" (at least when applying the rules), if we supplied SMT semantics for all "pure" builtins - I'd expect SMT solvers to just instantly evaluate those on their own then and eq(constant1, constant2) would be instantly SAT or UNSAT as well. But we're back to the trust issues against SMT solvers plus it'd still probably degrade performance...

But yeah, my idea was not to do anything like that or any reasoning whatsoever in a first step, but just have plain dumb evaluation of constants for the conditions - and in that case a rule for redundant masking wouldn't be possible as far as I can see.

ethereum / solidity

Replace rule list with "DSL" #7520