Open shazow opened 1 year ago
I'm assuming the RETURN opcode with non-zero size will indicate if a function returns a value, but relying on that means we'd need to construct instruction ranges for each function (should be possible assuming the selector table yields back-to-back functions). [Update: This looks fine]
On the upside, that should be sufficient to give us the return size, which is often a good proxy for guessing what the type is (e.g. 160 bits -> probably address). [Update: This is false]
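A rough sketch of the "look for RETURN inside a function's instruction range" idea, in case it helps the discussion. The helper names (`hexToBytes`, `findReturnOffsets`) are made up for illustration and are not part of whatsabi. It assumes the range starts on an instruction boundary, and it only locates RETURN opcodes; working out whether the size argument is non-zero would additionally need some stack tracking.

```typescript
// Convert a hex string (with or without a 0x prefix) into a byte array.
function hexToBytes(hex: string): number[] {
  const clean = hex.slice(0, 2) === "0x" ? hex.slice(2) : hex;
  const out: number[] = [];
  for (let i = 0; i < clean.length; i += 2) {
    out.push(parseInt(clean.slice(i, i + 2), 16));
  }
  return out;
}

// Hypothetical helper: offsets of every RETURN opcode within [start, end).
// PUSH immediates are skipped so data bytes equal to 0xf3 are not miscounted.
function findReturnOffsets(code: number[], start: number, end: number): number[] {
  const PUSH1 = 0x60;
  const PUSH32 = 0x7f;
  const RETURN = 0xf3;
  const offsets: number[] = [];
  let pc = start;
  while (pc < end && pc < code.length) {
    const op = code[pc];
    if (op === RETURN) offsets.push(pc);
    if (op >= PUSH1 && op <= PUSH32) {
      pc += op - PUSH1 + 1; // skip the immediate bytes of PUSHn
    }
    pc += 1;
  }
  return offsets;
}
```

In this sketch, the `start` and `end` values would come from consecutive entries in the selector jump table, which is the "back-to-back functions" assumption above.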
Using the dummy output value of [{type:"bytes32"}] seems to work to get at least a "readable" value for uint256 and address types. string gets butchered, and probably tuples etc. as well.
If a function returns a size that is larger than bytes32, what's a good strategy for returning an undecoded type to fit it? Like say it's 32+16+32 = 80 bytes (but we don't know the layout, we just see 80 bytes). Naive approach feels like returning 32,32,16 (basically bin packing from largest to smallest). Is there something better we could do?
Or maybe it's better to just use string type for anything >32?
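For reference, the naive largest-to-smallest split could look something like the sketch below. `guessOutputs` is a hypothetical name, not an existing whatsabi function. The resulting types are placeholder labels only: a standard ABI decoder assumes every static value occupies a full 32-byte word, so an entry like bytes16 would not necessarily decode an 80-byte blob correctly.

```typescript
// Hypothetical helper: split an opaque return size (in bytes) into placeholder
// ABI outputs, largest first: as many bytes32 words as fit, then one bytesN
// entry for whatever remains.
function guessOutputs(sizeInBytes: number): { type: string }[] {
  if (sizeInBytes < 0 || Math.floor(sizeInBytes) !== sizeInBytes) {
    throw new RangeError("sizeInBytes must be a non-negative integer");
  }
  const outputs: { type: string }[] = [];
  const fullWords = Math.floor(sizeInBytes / 32);
  for (let i = 0; i < fullWords; i++) {
    outputs.push({ type: "bytes32" });
  }
  const remainder = sizeInBytes % 32;
  if (remainder > 0) {
    outputs.push({ type: `bytes${remainder}` });
  }
  return outputs;
}
```

With this, `guessOutputs(80)` gives two bytes32 entries followed by one bytes16 entry, matching the 32,32,16 example.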
Started a WIP PR in #14, here are the vibes so far (from PR):
Still in the research phase, trying to find a way to detect output sizes but that's looking harder than I hoped.
It looks like modern Solidity wraps most outputs through a chain of jumps that prepares the data. It's going to be quite hard to do this with a single-pass static analysis.
Older Solidity (e.g. WETH contract with v0.4.x) does a simpler return macro per function window, those aren't hard to detect but extracting sizing reliably still seems hard.
Also I thought it'd be easier to detect address type outputs because they're 20 bytes rather than the usual 32, but I forgot that things get padded so it still ends up being 32 bytes.
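To make the padding point concrete, here is a small sketch (helper names are illustrative assumptions, not whatsabi code). ABI encoding left-pads a 20-byte address with zeros to a full 32-byte word, so the returned size is the same as for a uint256. Checking for 12 leading zero bytes is only a weak hint, since small integers look identical.

```typescript
// 24 hex zeros = the 12 bytes of left padding on an ABI-encoded address.
const ADDRESS_PADDING = new Array(25).join("0");

// Encode a 20-byte address as a single 32-byte ABI word (left-padded with zeros).
function encodeAddressWord(address: string): string {
  const hex = address.toLowerCase().replace(/^0x/, "");
  if (!/^[0-9a-f]{40}$/.test(hex)) {
    throw new Error("expected a 20-byte hex address");
  }
  return "0x" + ADDRESS_PADDING + hex;
}

// Hypothetical heuristic: a word whose top 12 bytes are zero *could* be an
// address, but any uint256 below 2^160 looks the same, so this cannot
// confirm the type.
function couldBeAddress(word: string): boolean {
  const hex = word.replace(/^0x/, "");
  return hex.length === 64 && hex.slice(0, 24) === ADDRESS_PADDING;
}
```

So the size alone (always 32 bytes) says nothing about whether an output is an address.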
I probably need to sleep on this in case there are other clever solutions but not looking great for single-pass static analysis right now. 😅
Updated the current state and challenges in the issue description, going to pass it around to some folks to see if anyone else has ideas. Feel free to re-share. :)
I just merged a branch which does more advanced static analysis into master, haven't done a release yet.
In some cases, it manages to successfully guess whether there are inputs or outputs (not super reliable, I'd say like... 60%?), but there have been major changes behind the scenes with how the static analysis works so we can do more advanced things moving forward.
Also we now have stateMutability included in the ABI, which is reliable in detecting payable functions, but not reliable in distinguishing nonpayable vs view yet.
Would appreciate some testing and feedback before I do a proper release. :)
Next release issue is here: https://github.com/shazow/whatsabi/issues/18
Unfortunately selector hashes don't include the return value, so none of the 4byte databases include return types.
Questions:
What we have:
Updated challenges:
RETURN from the end of each selector function's boundary.
STOP branch, which shouldn't be too hard to find in isolation (basically JUMPDEST STOP, sometimes there are multiples, not sure why). Could we just use the absence of a STOP or JUMP to a STOP offset as an indicator whether there is a return value of some kind?
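A minimal sketch of finding those JUMPDEST STOP branches, under the assumption that the bytecode is available as a plain byte array. `findStopBranches` is a hypothetical name. It only collects candidate offsets; deciding whether a given function actually jumps to one of them would still need jump-target tracking on top of this.

```typescript
// Hypothetical helper: offsets of every JUMPDEST that is immediately followed
// by STOP. PUSH immediates are skipped so data bytes are not misread as opcodes.
function findStopBranches(code: number[]): number[] {
  const STOP = 0x00;
  const JUMPDEST = 0x5b;
  const PUSH1 = 0x60;
  const PUSH32 = 0x7f;
  const offsets: number[] = [];
  let pc = 0;
  while (pc < code.length) {
    const op = code[pc];
    if (op === JUMPDEST && code[pc + 1] === STOP) {
      offsets.push(pc);
    }
    if (op >= PUSH1 && op <= PUSH32) {
      pc += op - PUSH1 + 1; // skip the immediate bytes of PUSHn
    }
    pc += 1;
  }
  return offsets;
}
```

A function whose range never jumps to any of these offsets (and contains no inline STOP) would then be a candidate for "returns something", which is the heuristic floated above.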