Add `by` blocks to function and method calls

atomb commented 7 months ago

Summary

We could reduce brittleness by allowing the preconditions for a function or method call to be proved in local blocks, rather than scoped to the entire definition in which the call occurs.

Background and Motivation

Reducing verification brittleness often requires hiding information from the solver, so as to allow it to focus on the information most relevant to the current goal. The assert ... by { ... } construct currently allows intermediate assertions, lemma calls, and other proof steps to appear in the by block, and the facts these steps introduce are visible only within the by block itself and the proof obligation stating the validity of the assertion. Calls to other functions and methods are one of the most significant sources of new proof obligations, and currently cannot benefit from a similar mechanism.

Proposed Feature

Extend the language so that a method call of the form o.m(...); could also be written o.m(...) by { ... }. The contents of the by block would be used in proving that any preconditions of m are satisfied. This would ideally be implemented similarly to how assert ... by is implemented.

Open question: should the contents of the by block also be used to prove the well-formedness of the entire call expression or statement?

Alternatives

Not having this feature, or something similar, limits how well we can reduce brittleness. There are a number of other things we could do that would address either some or all of the goals of this proposal, however.

We could allow a by { ... } block to appear on any sub-expression, which would allow proof of the preconditions for function calls but not method calls. It may be that we do this, and that function calls are handled by this more general mechanism. Even if we aim to do this for arbitrary expressions, implementing it for function call expressions may be a good place to start.
We could allow specific preconditions to be opaque and named from the caller's perspective. This would allow us to write something like assert m.Pre1 by { reveal m.Pre1; ... }; o.m(...). This would achieve largely the same goals, but may be more verbose. Note that both features could also co-exist. And this feature could be approximated, with no language changes, by making all preconditions in a program directly call opaque functions.
We could provide a general mechanism for limiting the scope of new facts that arise from assertions, lemma calls, postconditions, etc. By placing a call within a more limited scope, it would be possible to prove any side conditions of it in such a way that they don't impact other proofs outside of the block.

robin-aws commented 7 months ago

I feel like the readability of <expr> by { ... } is not quite as good as assert <expr> by { ... } - in the latter case the "by" is a clause on the fact that you're asserting something is true, and here's why. Whereas with f(x) by { assert 0 < x; }, it's not immediately clear what assertion or action the "by" is attached to.

We could provide a general mechanism for limiting the scope of new facts that arise from assertions, lemma calls, postconditions, etc. By placing a call within a more limited scope, it would be possible to prove any side conditions of it in such a way that they don't impact other proofs outside of the block.

This is how I'm leaning. I wonder if we could leverage the opaque keyword to have a generic way of opening a block that confines the additional facts it introduces to that scope:

var result: nat;
opaque {
  assert 0 < x;
  result := f(x);
}

We'd likely need an expression variant of this as well, and I'm less sure of the optimal syntax there. Perhaps:

var result: nat := opaque(assert 0 < x; f(x));

atomb commented 7 months ago

I feel like the readability of <expr> by { ... } is not quite as good as assert <expr> by { ... } - in the latter case the "by" is a clause on the fact that you're asserting something is true, and here's why. Whereas with f(x) by { assert 0 < x; }, it's not immediately clear what assertion or action the "by" is attached to.

We could provide a general mechanism for limiting the scope of new facts that arise from assertions, lemma calls, postconditions, etc. By placing a call within a more limited scope, it would be possible to prove any side conditions of it in such a way that they don't impact other proofs outside of the block.

This is how I'm leaning. I wonder if we could leverage the opaque keyword to have a generic way of opening a block that confines the additional facts it introduces to that scope:
var result: nat;
opaque {
  assert 0 < x;
  result := f(x);
}

I do like this, and it's very general. I wonder whether, if we did this, we could make it possible for any block to be opaque. That is, in addition to examples like the one above, we could say something like:

if P opaque { s1 } else opaque { s2 }

That could possibly introduce parsing issues, especially if opaque(e) is allowed inside P. A potential alternative would be to use different grouping characters for opaque blocks. So maybe if P {{ s1 }} else {{ s2 }} would keep new facts from either s1 or s2 from leaking out. I'm not sure what I think of that.

keyboardDrummer commented 7 months ago

. This is how I'm leaning. I wonder if we could leverage the opaque keyword to have a generic way of opening a block that confines the additional facts it introduces to that scope:

I would think that usually you don't want to confine all the additional facts to a particular scope. For example if I have:

assert preConditionOfFooCall;
var x := fooCall(a, b);
assert makeUseOfFooCallPostcondition(x)

I cannot rewrite this to:

opaque {
  assert preConditionOfFooCall;
  var x := fooCall(a, b);
}
assert makeUseOfFooCallPostcondition(x)

What I would need is more like:

knowing {
  assert preConditionOfFooCall;
} do {
  var x := fooCall(a,b);
} until {
  assert makeUseOfFooCallPostcondition(x)
}

The last block { } helps to make it clear that the variable x is in scope. You could remove the {} for brevity, but then I think it would read as if x is not in scope there.

What I don't like about the above is how verbose it is. Alternatively, you enable a shorter syntax that only allows providing knowledge that's used for a single statement, by introducing by { ... } as a generic suffix for statements, like so:

var x := fooCall(a,b) by { // The by clause can be added at the end of a statement, like assert .. by
  assert preConditionOfFooCall;
};
assert makeUseOfFooCallPostcondition(x);

For expressions, we already have assert <proof>; <expr>. I would think that <proof> here is only in scope for <expr>, and does not leak to the outside. Would that be enough for providing proofs within a bounded scope? I vaguely recall @MikaelMayer changed (or wanted to change) the semantics around this, so the proof would not leak to the outer expression.

robin-aws commented 7 months ago

Agreed that in some cases you'd want to expose some facts from an opaque block, and I was also going to suggest supporting:

opaque ensuring postConditionOfFooCall {
  assert preConditionOfFooCall;
  var x := fooCall(a, b);
}
assert makeUseOfFooCallPostcondition(x)

That feels pretty natural given the precedent in forall statements.

atomb commented 7 months ago

Agreed that in some cases you'd want to expose some facts from an opaque block, and I was also going to suggest supporting:
opaque ensuring postConditionOfFooCall {
  assert preConditionOfFooCall;
  var x := fooCall(a, b);
}
assert makeUseOfFooCallPostcondition(x)

What would the difference be between that and the following?

assert postConditionOfFooCall by {
  assert preConditionOfFooCall;
  var x := fooCall(a, b);
}
assert makeUseOfFooCallPostcondition(x)

The one thing I can think of is that it would allow non-ghost code. Do you have any other differences in mind?

robin-aws commented 7 months ago

Allowing non-ghost code is pretty key though, if we're trying to provide a way to attach proofs to arbitrary pieces of code to prove it's well-formed/corect, without having those facts leak into the rest of the code.

robin-aws commented 7 months ago

corect

The irony :P

atomb commented 7 months ago

Allowing non-ghost code is pretty key though, if we're trying to provide a way to attach proofs to arbitrary pieces of code to prove it's well-formed/corect, without having those facts leak into the rest of the code.

Oh, yes, I entirely agree. We need to allow non-ghost code in this new feature. But if that's the only difference with assert by, it's pretty small change (especially to a user). I could almost argue that we could allow non-ghost code in assert by with no syntactic changes. But the terms assert and by do really suggest a proof-only context. So maybe a different syntactic form makes sense.

robin-aws commented 7 months ago

More than that, I think at a deep level if you say assert E by { argument }, the "argument" really needs to be only proof code, essentially an inline lemma. Allowing the argument to be arbitrary code with side-effects, even just assigning variables with a wider scope than the argument, really violates what by blocks should be for.

For another perspective: by blocks are always erased, opaque blocks certainly can't be.

robin-aws commented 7 months ago

One related point is that if we had opaque ensuring E { code }, assert E by { proof } would no longer be strictly needed, since you could always turn than into opaque ensuring E { proof } - just a special case where the opaque block happens to be only proof code with no side-effects.

That special case is probably still worth its own syntax for readability, though, so I wouldn't suggest deprecating assert E by { proof}.

atomb commented 7 months ago

One related point is that if we had opaque ensuring E { code }, assert E by { proof } would no longer be strictly needed, since you could always turn than into opaque ensuring E { proof } - just a special case where the opaque block happens to be only proof code with no side-effects.

That special case is probably still worth its own syntax for readability, though, so I wouldn't suggest deprecating assert E by { proof}.

Yep, that all sounds good to me.

One question that raises, which comes up in the example @keyboardDrummer posted above. It seems that side effects from opaque ensuring certainly need to be visible outside the block. What about declarations? It seems odd to allow them to escape, but I also tend not to like declaring variables without immediately assigning them (though Dafny does protect against the bad things that can happen when you do that), and it'll frequently be necessary to capture the results of method calls.

It also seems like this structure might generalize to expressions, as well. I'm not sure if the same syntax makes sense, but the same overall structure might.

Perhaps it would also be possible to leave out the ensuring E if you really want to hide everything. This could be true for certain method calls.

keyboardDrummer commented 7 months ago

Agreed that in some cases you'd want to expose some facts from an opaque block, and I was also going to suggest supporting:
opaque ensuring postConditionOfFooCall[p1->a,p2->b,result->x] {
  assert preConditionOfFooCall;
  var x := fooCall(a, b);
}
assert makeUseOfFooCallPostcondition(x)
That feels pretty natural given the precedent in forall statements.

Having to write postConditionOfFooCall[p1->a,p2->b,result->x], which can be a large expression, can be avoided with what I suggested before:

var x := fooCall(a,b) by { // The by clause can be added at the end of a statement, like assert .. by
  assert preConditionOfFooCall;
};
assert makeUseOfFooCallPostcondition(x);

Also, to make opaque ensures work, I think you need to add some automation:

Identify declared variables and declare them before the block

Identify modified variables and havoc them after the block

// implicit var x;
opaque 
ensuring postConditionOfFooCall[p1->a,p2->b,result->x] 
ensuring postConditionOfFooCall[p1->a,p2->b,result->y]
{
assert preConditionOfFooCall;
var x := fooCall(a, b);
y := fooCall(a, b);
}
// implicit havoc x;
// implicit havoc y;
// implicit assume postConditionOfFooCall[p1->a,p2->b,result->x]  && postConditionOfFooCall[p1->a,p2->b,result->y]
assert makeUseOfFooCallPostcondition(x)

I think both of the above automations are not intuitive:

The block structure implies the declared variable x is not in scope outside of the block
That any modified variables are havocked is not apparent. I think you could partially remedy this by only allowing opaque ensures block to modify variables that are declared inside of it.

keyboardDrummer commented 7 months ago

It also seems like this structure might generalize to expressions, as well. I'm not sure if the same syntax makes sense, but the same overall structure might.

What are we missing for expressions? I feel the tree structure of expressions already enables granular control of which information is available where, with the rule that information only flows down the tree.

robin-aws commented 7 months ago

Yes I don't think we want declarations to escape opaque { ... } blocks, that would be surprising given no other blocks in Dafny do that. Declaring uninitialized variables is pretty common in other languages for similar reasons though, especially when you need to wrap some code in a try { ... } catch { ... } block and can't declare variables for afterwards in the try block. As you say Dafny is unusually well-equipped to avoid the downsides of that!

I do think we need an expression variant. You could almost get away with just allowing opaque blocks in the statement part of statement-expressions (opaque { ... }; E), but then the facts inside the block shouldn't be usable to make E well-formed/valid.

We could just support the same outer syntax in expression contexts:

var x := opaque {
  var y := SomeLemmaMaybe(z);
  assert preConditionOfFoo;
  Foo(z)
};

In other words in statement contexts you can do opaque { <statements> } and in expression contexts you can do opaque { <expression> }, which evaluates to the same value as the expression in the brackets. I think that's fairly natural since curly braces are already allowed in a few expression kinds, such as match expressions.

Yup, the ensuring E can be optional, just as it is on forall statements. In fact forall statements with no quantified variables are already pretty equivalent to opaque blocks, and I think forall ensuring E { ... } might have been a common idiom, which we deprecated in favour of assert E by { ... }. :)

robin-aws commented 7 months ago

One interesting open question for me, independent of syntax: it seems like we need some way of obtaining a value while being able to hide all the facts needed to obtain it. What about when the type of the obtained value is a subset type? Would we still retain the fact that the value satisfies the subset type predicate, for the sake of the type system?

I'd actually suggest we even throw away that fact, only because then there's the option of allowing it to escape via something like ensuring IsOdd(x) as needed. Whereas if we always allow that fact to escape, it's less clear how to hide it if needed. The only weird aspect to that is whether we'd allow:

var x := opaque ensuring IsOdd(x) { ... }

...given that x is not yet defined on the RHS of that declaration. Again it doesn't feel too weird given you can say var x :| P(x).

robin-aws commented 7 months ago

What are we missing for expressions? I feel the tree structure of expressions already enables granular control of which information is available where, with the rule that information only flows down the tree.

Given verification is ultimately solving an SMT program, that's not fully true. AFAICT it's easy for one assertion to help satisfy another one textually above it.

Even ignoring that, I think this situation is common:

function Foo(x: nat): nat {
   assert preconditionOfBar;
   var y := Bar(x);
   // preconditionOfBar is no longer needed,
   // and can be noise that makes assertions from here on more expensive
   assert preconditionOfArfle;
   Arfle(y)
}

keyboardDrummer commented 7 months ago

Yes I don't think we want declarations to escape opaque { ... } blocks, that would be surprising given no other blocks in Dafny do that.

That still has the downside that it would havoc any variable that it modifies, so you have to put the effect of those modifications in the ensures clause. If you also don't allow modifying variables, then it seems like it would have the same power as the current assert .. by {} statement.

What do you have against allowing a var y := Bar(x) by {} statement ? To me it seems that would give the best UX for the situation which you describe as being common.

Also, how do you compare what you mentioned versus what I mentioned before:

know {
  assert preConditionOfFooCall;
} during {
  var x := fooCall(a,b);
} until {
  assert makeUseOfFooCallPostcondition(x)
}

Now there is no need to write an ensures expression to capture the modifications to x, and the variable x can be declared when it is assigned.

Given verification is ultimately solving an SMT program, that's not fully true. AFAICT it's easy for one assertion to help satisfy another one textually above it.

What I meant was that we should make it so that information only flows down the expression tree. This would mean that

var x := 
  SomeLemmaMaybe(z);
  assert preConditionOfFoo;
  Foo(z);

Would have the same semantics as what you described

var x := opaque {
  SomeLemmaMaybe(z);
  assert preConditionOfFoo;
  Foo(z)
};

MikaelMayer commented 7 months ago

I too prefer that variables declared in any kind of curly braced block, in Dafny, should not escape. So I suggest we focus on solutions that would not break this nice principle.

I also had suggested that in a StmtExpr M(); X, all the information from M() should not be visible outside of the expression, only to prove necessary preconditions of X. That would be true for lemma calls, but also reveal statements, assertions, calculations, etc. Hence, to hide preconditions, you would just prove the precondition, hide it with a label, and reveal it just before calling bar.

function Foo(x: nat): nat {
   assert h: preconditionOfBar;
   var y := (reveal h; Bar(x)); // Parentheses not needed since reveal h introduces a StmtExpr
   // preconditionOfBar is no longer needed but is not visible anyway, yay!
   assert preconditionOfArfle;
   Arfle(y)
}

Assertions and requirements in functions can have labels since June 2023, I added them knowing that they would match what happens in statements and also would enable such scenario about making some facts opaque.

The only problem was for methods calls, I had not suggested anything, so the above would not work if Bar and Foo were methods. This is where I think Aaron's simple proposal would look great, because it generalizes the replacement of the semicolon to a " by {} " block from assertions to method calls. The only minor problem I have with this approach is that, in the case of the assertion, we give facts to help Dafny prove the assertion, whereas in the proposed syntax, we give facts to help Dafny prove the well-formedness of the method call (since we prove its precondition), which is a bit odd. But definitely not a blocker.

An alternative, under the condition we accept that local assertions don't escape, would be to support StmtRHS that contain a ghost statement (for proof) and the RHS (for the method call), like this:

method Foo(x: nat) returns (j: nat) {
   assert h: preconditionOfBar;
   var y := (reveal h;  // Parentheses not needed since reveal h introduces a StmtRHS
                Bar(x)); // Method call.
}

That would have the benefit of looking similar to the scoped assumptions and reveal in expressions, although it might be a bit trickier to parse.

I was suggested also that we could declare a variable "opaque" (both in expressions and in statements), so that it also immediately lets user use the variable's name for revealing it. My original proposal was to use "opaque" as a wrapper of expressions but it would require to introduce a label anyway like "opaque h: expression" and would make things a bit harder to handle.

atomb commented 7 months ago

After all of this discussion, I'm leaning in a similar direction to @MikaelMayer, I think. More specifically:

I really like the idea of the expression form (assert P; e) keeping P local to the parenthesized block, without any additional keywords. It's possible that this would break existing code, and I'd be in favor of requiring some other delimiter or keyword in that case.
The idea of opaque variables is one that I now remember discussing a while back, and I also like it as a nice way to break an expression into opaque pieces.
I think that, for syntactic convenience, assert P by { s } and x := C.m(y) by { s } both seem worth having. I could imagine that they could become syntactic sugar for some more general construct in the long run, but having them on their own makes it so that you don't need to repeat (and do manual substitutions) into potentially large expressions.
I'd like to continue thinking about a more general construct that could be used to encode by clauses on both assertions and method calls, but I'm not sure we have quite the right thing yet.

Given all that, what I'd propose we do for now is pin down the semantics of by clauses on methods, implement it, and continue exploring the idea of a general-purpose construct that can encode both of the by constructs and potentially other use cases. I suspect that what we come up with won't change the semantics of either by construct itself, and will just broaden what we can say. And we'll learn more about what we want in the process of implementing and experimenting with the method-specific variant.

And, as for the expression forms, how about we discuss those in the context of a separate issue?

keyboardDrummer commented 1 month ago

Implemented by https://github.com/dafny-lang/dafny/pull/5662

dafny-lang / dafny