Closed leonardoalt closed 7 years ago
To Question 1:
If the Solidity compiler uses INVALID
for undesired user input, that's a bug in the compiler. Please file an issue.
Recently, a similar issue was fixed.
To Question 2:
solc/solc --asm-json foo.sol
associates each instruction with a range in the srouce code.
{
"begin" : 39,
"end" : 43,
"name" : "PUSH",
"value" : "0"
},
means the PUSH
comes from the srouce code characters 39--43 (counted from zero). I'm not aware of any kind of variable->declration look up.
@leonardoalt awesome idea. We thought about implementing that feature previously but haven't got chance to address it yet. Would love to see your PR soon. Regarding your questions, I hope you are happy with @pirapira's answer. Otherwise, please let me know.
@pirapira Thanks for the answers!
I'll definitely try the solution for Question 2.
I'm a little confused by Question 1's answer though. If not INVALID, what is the correct bytecode for an assertion? Maybe I am also misunderstanding the meaning of an assertion in Solidity. For me an assertion represents a property that holds at that point concerning anything, where by "anything" I mean mostly correctness of a piece of code, not only undesired user input. I interpret that this view is supported by this short discussion: https://github.com/ethereum/solidity/issues/2118
@loiluu Cool! I guess I might have to change it anyway if INVALID is not the way to go with assertions. If you're interested, we can discuss some ideas to improve the tool on this direction.
@leonardoalt A correct bytecode for a require
is REVERT
opcode, which will be introduced in the Metropolis hardfork. Even before the Metropolis hardfork, we can use the opcode for REVERT
because that throws anyway.
Your interpretation is right. When an assertion fails, there is a bug in the Solidity code, the Solidity compiler, or the EVM implementation.
@pirapira Oh I see. Cool, gonna use that one then. Thanks again!
@leonardoalt regarding unbounded loop, we do allow you to specify the bound for the loop. Try the --looplimit
option.
I am happy to discuss more regarding this direction. Lets do it here so other people can join?
Would this work on bytecode too where assertions fail?
To be clear, assert(<condition>)
will do INVALID
while require(<condition>)
will do REVERT
. My understanding is that the intended use case is that assert
s should never be hit and require
may be hit during normal execution. For a tool like this, that means that what you want to look for is assert
s, not require
s.
@joeykrug yes, Oyente basically works on bytecode.
@MicahZoltu Exactly what I thought, thanks.
@joeykrug The idea is to verify the bytecode, but in case there is a bug, to try and map it back to the Solidity code (in case the source was written in Solidity).
@leonardoalt please let us know how we can help you on this issue!
@loiluu Thanks! I haven't been working on it since I opened the thread, but I'll implement the solution to Question 2 this week. Ideally this will take care of pointing to the function that contains the broken assertion. Any ideas on how to map values from the model back to Solidity variables? The first way that I thought would be to separate storage and local variables, could be a start.
Just wanted to comment and say this would be an amazing feature ^^
Cool, I see the PR regarding this topic was merged :)
Continuing the discussion here:
as I mentioned in the PR, there are cases that lead to false positives (false reports on assertion failures).
One of those cases that I am aware of is the use of ENUM
. It uses INVALID
a lot, so you might see false positives if you run Oyente with the option -a
on a code with ENUM
s.
The combination of this new code and the recent REVERT
implementation lead to a nice feature: one can use require
as a formal assumption on the code, which will be used as a constraint to verify the assert
.
Take the following code, for example:
contract T {
function f(uint x) {
require(x < 2000);
assert(x < 3000);
}
}
In this case, after the require
, x < 2000
is true for the rest of the function, therefore it is also < 3000
. If you change the require
line to require(x < 4000)
for example, the assertion is not true anymore, and Oyente will give you a model (even though in a not so nice format :p) which is a value for x
that satisfies the require
but breaks the assert
.
The hacks in my code to identify the use of INVALID
that does not come from an assert
might not be necessary anymore since those were probably bugs and have already been fixed in the newer versions of solc
. I'll check that this week.
I'm closing this issue since assertion checks are working well and assertion fails have been nicely mapped back to the Solidity assert
code with the recent work on Source Map. I haven't seen any false positives, but if I do I'll create a new issue.
Good work. Thanks @leonardoalt
Hi there,
I've extended Oyente a little bit in my fork (https://github.com/leonardoalt/oyente) to try and verify any Solidity assertion. An assertion is compiled by Solc as a simple jump to INVALID in case the condition fails. So I use the SMT branch queries to decide whether a certain symbolic path may reach an INVALID (an assertion failure), and if that's the case ask z3 for the model. This has worked pretty well for the examples I've tried, but of course if you introduce unbounded loops it's gonna explode.
The basic part is identifying assertions in the ASM code, since Solc uses INVALID for other cases as well, for example if the called function doesn't exist (this is always contained in Block 0), or if CALLVALUE is zero. So far I've been considering as assertions blocks that start with INVALID except the cases above. Question 1: Is anyone aware of a better way to detect assertions in the basic blocks?
If INVALID is reachable, the model is retrieved from z3 and printed. I also tried some tricks to automatically figure out in which Solidity function the assertion that failed is, by checking which blocks represent the start of each function (by using the signatures and hashes of the functions), building a reverse CFG and walking backwards from the block that contains the assertion. Of course this only guarantees what was the top-level function that led to the assertion, but its correctness is not guaranteed in case of nested function calls. Question 2: Does anyone have an idea of how I can map this information back to the Solidity source code in a better way? Ideally I'd like to map back symbolic to Solidity variables as well, but I know it's even harder.
Notice that I am trying to do all that without changing the language or the compiler, which would make it easier but less usable.
Cheers, Leo