essential-contributions / pint

Pint, the constraint-based programming language for declarative blockchains
Apache License 2.0
17 stars 5 forks source link

Consider (seriously) using state read ASM for var initialisers. #892

Closed otrho closed 3 weeks ago

otrho commented 1 month ago

Background

Over the journey we've iterated on trying to find the best way to declare decision variables and state. We used to have let instead of var and we didn't have const until it was deemed necessary.

Now we have var to name a decision variable slot, and state to name a state slot. If a var has an initialiser then a constraint is injected to ensure the variable is solved to that single expected value. If a state has an initialiser, as of today is must refer to storage and will emit some state read ASM to fetch it and populate the slot.

Problem

There are small but non-trivial problems with both these initialisers.

Decision variables

var initialisers are often redundant. If an initialiser does not refer to state then it is essentially a constant, and it has been assumed that one day instead of keeping these as vars they would be optimised away, via constant propagation. This is fine, but is hasn't happened yet. Making the solver be responsible for initialising a decision variable which is basically a constant is still redundant, if this optimisation isn't feasible.

But another problem might be that building the constraint for the initialiser is overly convoluted and therefore solving it is also prone to error. I found this recently with implementing union variant immediates.

Typically the constraint for an initialised var is a simple == expression. E.g., var a: int = 1; becomes var a: int; constraint a == 1;. But equality comparisons are not legal for union variant immediates as the LHS and RHS may be different sizes. (It may be possible to implement with some crafty runtime checks but these are not desirable at the moment.) So for e.g., var a: option = option::some(1) the constraint can't be constraint a == option::some(1); and would instead have to be:

constraint match a {
    option::some(n) => n == 1,
    else => false,
};

This is a lot of mucking around and perhaps wasted gas just to have a 'constant variable' and this is a simple case.

State

The only minor problem with state is that they must just read storage. I assume this will be expanded in the future to allow more arbitrary expressions, allowing state to be initialised based on other state, or perhaps variables. I'm scratching my chin over that, I need to be more familiar with how it works... but...

A remedy

It seems to me that vars which have initialisers (which aren't constant propagated away) should instead be put in state slots and initialised with state read ASM. That way we could initialise our var a: option = option::some(1); to be state a: option = option::some(1);, to literally initialise that slot to a tagged union, something like { 1, 1 }, and not to require a constraint to be either solved nor validated.

I might be missing something here though, like perhaps that constraint is in fact less expensive than the state initialiser. But we talked about this in a call recently and seemed to agree this might be a good idea. That was before I realised simple == between union variants was infeasible and so that only bolsters the argument for this change.

Curious for input @mohammadfawaz and @freesig

freesig commented 1 month ago

Yeh state initializers are more expensive to run for multiple reasons. This would be a lot cheaper in a constraint. I do wonder though if we just want to remove var a: type = val? Like you said most of the time the dev really means const A: type = val;. If they want to do something like:

state x = storage::x;
var a: int = x + 2;
var b: int = a + x;

Then they can write it explicitly:

constraint a == x + 2 && b == a + x;

I think that's fairly reasonable.

Also state initializers mean something a little different. A var is always provided by the solver like you said so:

var a: int;
var b: int;
constraint a > 1;
constraint b > a;

Both these constraints can begin checking immediately in parallel. A state is not provided by the solver so:

state a = storage::a;
state b = storage::b;
constraint a > 1;
constraint b > a;

Both constraints need to wait for the state initializer programs to run first. Even if we introduce state initializers that run once and don't touch storage like:

var a: int[300];
state b = hash(a);
var c: b256;
constraint b != c;

The constraint still needs to wait for it to run.

It might get a bit odd if a dev writes:

state a = 20 * 2;
var b: int;
constraint a + b > 100;

The state initializer should be a const here. Maybe we can optimize it but also maybe a warning is enough.

Things like:

state a = 20 * 2;
constraint a > 10;

What should happen here? Seems like it should all be trivially optimized out because we know it's always going to return true at compile time. Or

const a: int 20 * 2;
constraint a > 50;

This can't be removed but I guess it could be optimized down to constraint false; but also maybe it should be a warning.

mohammadfawaz commented 1 month ago

It seems to me that vars which have initialisers (which aren't constant propagated away) should instead be put in state slots and initialised with state read ASM.

Yes I think this is the way to go. That or const.. Either way, vars should probably not have initializers.

mohammadfawaz commented 3 weeks ago

vars are not predicate parameters. This is no longer an issue.