Closed pirapira closed 5 years ago
Pinging @daejunpark and @ehildenb.
We were talking about this a few months back. The only difference is we were talking about a special comment type for analysis. I think everyone was on board with the concept, but this particular implementation is interesting.
I think a more succinct way to do it is to include valid Python syntax that basically won't compile to anything in "normal" mode, and in static analysis mode would add extra asserts to the bytecode used for analysis, throwing the special opcode as you pointed out.
We would love to leverage this kind of analysis as a tool for our compiler. Perhaps this specification can be leveraged as a standard for any EVM language, and we can build toolsets that make this kind of analysis easier within a language-agnostic framework?
I think with all of our langauges targeting the EVM, we should be thinking much more about toolsets that leverage that layer so we're not duplicating work in every compiler.
Yes, a common convention would save work. Solidity took the approach "INVALID
means never-to-be-executed-if-program-is-sane".
Properly functioning code should never reach a failing assert statement; if this happens there is a bug in your contract which you should fix.
I'm in favor of the INVALID
opcode means "never to be executed, if so then program has bug". It might not be so with legacy contracts, but I think it makes sense moving forward.
Alternative is making some explicit ASSERTION_FAILURE
opcode, but if it would have same runtime behaviour as INVALID
I don't see the benefit. If we can think of a different runtime behaviour for it, then it should be considered.
Another alternative is using some unused opcode as the de-facto ASSERTION_FAILURE
opcode. This will have the same runtime behaviour as INVALID
though, and may not be future-proof.
Yeah, that's a toughy.
Perhaps the code gets added when compiled for FV mode (in either language) and the json assests file contains a listing of those bytecode locations that should trigger analysis failures if reached.
That would start to get into the need to formalize the assets file spec to leverage in different tools (e.g. solc --standard-json
/vyper -f json
) so that bytecode and metadata could get imported into tools in a standard way.
What's wrong with just generating INVALID
when the code has an assert ...
(or sure ...
as @pirapira has above)? I suppose you could only insert it when in FV mode, but I think you would actually want to always insert it, even in production mode. If an assert ...
failure is triggered at runtime, you want the contract to throw because people may take advantage of the bug the assert ...
is guarding from.
I don't see the need for reading extra metadata, the tools can just analyze "is INVALID
ever reachable?", which is a fairly simple query to make for a symbolic execution engine, and I think most static analysis tools should also be able to do something for that style of query.
(I saw assert
was already taken, so I sidestepped.)
@ehildenb It's a suggestion. Some peoople don't want extra code to pollute their contracts, hence having a separate FV-only statement that would get dropped in normal compilation because you've "proved" it's not possible. We could definitely argue whether to give the person the option to include them in their final bytecode.
As far as metadata, it's more a UI thing. Yes the execution engine can tell when it hits a certain statement, but "there's a problem" is more opaque than "there's a problem L17, C40" or "your FVassert statement on line 17 failed"
@fubuloubu well, programmers have the choice whether or not to use the new sure
statements.
If some programmers go for
# @static-assert balance > 0
that's their choice.
sure x
--> INVALID
is just one way, very easy for static analyzers to understand.
Well, in that case, wouldn't it make sense to have assure
use the ASSERT
under the hood and additionally generate the @static-assert
for FV. That way we have three options:
assert
-> always checkassure
-> always in bytecode, should never happen, FV mode switches this to INVALID
for analysis#@static-assert
-> never check, FV only, FV mode adds INVALID
for analysis@fubuloubu well, #@static-assert
is a comment, so it should never translate to any instruction. However, #@static-assert x
can be translated into some sort of metadata, which is out of scope of my proposal. My point is that designing a common format of such metadata is hard, while it's easy to set up tools to detect INVALID
.
@fubuloubu I won't believe that the FV mode and non-FV mode behave the same, so if I analyze the FV mode code, I would just deploy the FV mode code.
@fubuloubu so, I think it's safer to have assure
or not to have assure
. ~disable assure
keyword in the non-FV mode. The compiler should error out ("error: assure
keyword is only available in the FV-mode").~
Then you have two copies of the same code (one with the statement and one without), and that's a no-no to me. I think maybe make it more obvious that they're different, and that @assure
will be removed in non-FV mode.
Well actually, I screwed that up. assure
would always generate bytecode, just use two different opcodes depending on mode, #@static-assert
would only get generated in FV-mode
I will only have one copy of the code with assure
, and I only use ~the FV~ (EDIT) one mode.
I'm not going to use #@static-assert
because I don't know what it does.
@fubuloubu if assure
generates different opcodes depending on mode, I won't trust that the two different versions of the generated code behave the same.
So I'm against any mode thing. That ambiguates the semantics of Vyper.
If a programmer doesn't like INVALID
, they don't use assure
.
Hmm, that's a good counterargument re: style. Don't make it complicated right? So people use it correctly.
INVALID
strikes me as a bit of a kludge though. Perhaps Vyper generates metadata for the FV tool about which asserts are to be used for what purpose?
Perhaps Vyper generates metadata for the FV tool about which asserts are to be used for what purpose?
That sounds like the ultimate right thing to do. The proposal here is made for impatient developers of static analyzers who want to try out their techniques today.
Well, I'm not sure extra meta-data will be necessary. @pirapira is it the case that Solidity will only generate INVALID
for failing asserts? And right now Vyper doesn't generate INVALID
at all. So if it's never used for anything else, then it seems like a natural use case.
But, I think either way, we should move forward with generated INVALID
, then see how the tools can take advantage of it and how they are limited by not knowing the reason for the INVALID
(though, I also think this won't be true, with KEVM you will just be able to look at the resulting final state and see the reason for the most part, and I think other symbolic execution engines will be similar). But if there are limitations uncovered, then we will have data supporting how to structure metadata around the invalids.
@fubuloubu by the way, never-failing INVALID
is cheaper than never-failing REVERT
.
Yes, the runtime cost of INVALID
is higher (if executed) because INVALID
depletes all remaining gas, but remember that never happens. So, the runtime cost is equal (these instructions are never executed, so zero).
The deployment cost is cheaper with INVALID
because INVALID
takes no inputs, and the program is shorter. REVERT
takes two stack elements, so you need to PUSH
those on the stack, so the program is longer. So, the deployment cost is more expensive with REVERT
.
@ehildenb yeah, Solidity generates INVALID
for failing asserts. @chriseth sounded glad you're going to use it.
Interesting discussion, so I prefer only having a single assert type statement in the base language. It feels like we are altering the base language just to accommodate static analysers? If that is the case, I opt for perhaps a secondary parameter to assert that convert the assert to use invalid opcode. Something like assert False, 'invalid'
and additionally we bury the use case somewhere deep in the documentation so no newcomers accidentally use it. But assure
or sure
also works just fine - as long as we don't really advertise it's usage for newcomers.
I think this misses the point though. The sure ...
statement (or whatever it ends up being called), is never to be reached, and if it is reached then something has gone horribly wrong and state should be reverted. This is useful in production code in its own right, without even thinking about symbolic execution or static analysis engines. The fact that it makes it easier to feed queries into these engines is just an added bonus (which admittedly, is what prompted @pirapira and I to begin discussing it).
@ehildenb I am probably missing something (very very likely :stuck_out_tongue_winking_eye:), but as far as I understand these never-to-be-reached if statement could just as well use assert?
@jacqueswww good point. There have been confusions around assert()
and require()
in Solidity. For me, a less advertised syntax works. If the static analyzers prove themselves useful, users will ask for more convenient syntax. The important thing for me right now is to give some chance for static analyzers to demonstrate their usefulness.
I like the word assure
as more violent/stronger keyword to assert
if we do go with this route ;)
@jacqueswww thanks for the suggestion. I updated the description.
I think it's less violent haha, but I like assure
a lot. Captures the point well: "give me assurance this thing right here is true, just in case...."
@pirapira I see your point now about REVERT
vs. INVALID
re: additional gas loss risk being acceptable (as it should never be triggered). :+1:
From meeting: Create an experimental keyword (warning in compiler 'NOTE: you are using experimental features' ) assure
that can be tested.
@fubuloubu I think it was you who previously suggested should
for this when I brought it up a while back. As I think @pirapira might've been getting at, while there are still doubts about the effects on careless programmers (@ehildenb), something unnecessarily verbose like shouldhold
would be perfectly fine for FV people, so might be a good choice temporarily.
I suggest both assure
and should
. The only difference is that for assure
, the main Vyper compiler will reject a program if it doesn't find a proof in a special auxiliary file (or doesn't find a short proof with an SMT solver, etc). should
or shouldhold
, on the other hand, expresses the developer's intention perfectly, while making clear to anyone reading the code that they can't count on the statement having actually been formally verified (a point made to me by Accord Project's Ergo developer Jerome, when I started this same conversation with them recently).
Would like to note the use cases of this for defining invariant conditions inside code for property-based testing (more here), and checks SMT/SAT solvers and other symbolic analyses and formal verifications
Note: most tools use the 0xfe
opcode (designated INVALID
) to check for these conditions:
https://twitter.com/vwuestholz/status/1102587249521319936
Yep that's the plan.
Actually will put this on my queue now :)
TODO Document new functionality, and give an overview of how it may be used effectively.
As discussed onthe Bi-Weekly meeting I prupose we instead extend assert instead
def test(a: ):
assert self.cond != 1, UNREACHABLE # Uses INVALID opcode.
@fubuloubu @charles-cooper I will leave this here for 24 hours, so we can make sure we have made a good call, and then I will amend my previous PR to use the above syntax instead.
That looks like a nice feature!
I am not entirely sure about the UNREACHABLE
syntax though, it might be confusing for the reader. I like having two functions, one for inputs validations, and one for holding the code properties. In theory, if your code is bug-free, you could just remove all the code checking the properties (while you need to keep the input validations).
So something like assure
/ensure
/ @ensure(..)
might be less error-prone, and helps the reader to clearly define what is an input validation check, and what is a property.
It would be nice also to have a compilation mode where these properties do emit bytecode. Typically, if I can prove that all my properties are safe, I would want the compiler to not produce code for them.
If we have assure
/ensure
/@ensure(..)
syntax, we could have an optional argument: generateCode
/isProofed
or something like that, which I can set to False/True
once I proved it, to indicate to the compiler to not generate code. This argument could even be set by a third-party analyzer.
Preamble
Simple Summary
Add a special type of assertion that should never ever fail. When a static analyzer can fire it, the program surely has a bug.
Abstract
Static analyzers can detect bugs, but only when the desired properties are specified. The easiest way is to insert a "never-to-fail" assertions in the program.
Motivation
Mythril has a feature to detect when 0xfe is somehow reachable. KEVM can soon do that.
assert
in vyper is expected to fail sometimes, and it is compiled toREVERT
.REVERT
might indicate a mistake of the caller, a mistake of the programmer, or a mistake of the compiler. So, static analyzers cannot shout "this is a bug".Instead, if Vyper has a never-to-fail assertion that translates to
INVALID (0xfe)
, static analyzers can confidently shout "this is a bug. See, this execution reachesINVALID
".Alternatives are specially formatted comments like ACSL or JML, but they are hard to learn.
Specification
Add a additional custom constant to the
assert
statement:as well as
Backwards Compatibility
A program containing a name
assure
will cause compillation errors.Copyright
Copyright and related rights waived via CC0