crytic / ethersplay

EVM disassembler
GNU Affero General Public License v3.0

LLIL lifting progress #25

Closed f0rki closed 6 years ago

f0rki commented 6 years ago

This PR contains some progress on lifting EVM instructions to LLIL. I think all of the instructions are modeled in terms of their stack usage. MSTORE, MLOAD, SSTORE, and SLOAD are not implemented correctly because of limitations in Binary Ninja (see #15). I tried to split the address space by adding huge offsets (e.g., 2**256 for all addresses in EVM memory space), but apparently that doesn't work because pointer offsets can't be that large.
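A rough sketch of the offset idea in plain Python, not the Binary Ninja API. The 64-bit pointer width, the storage base, and all names here are assumptions for illustration only; the point is just that a base of 2**256 cannot fit in any ordinary pointer-sized offset.

```python
# Hypothetical illustration: give each EVM address space its own base offset
# so memory, storage, and code addresses never collide in one flat space.
WORD_BITS = 256      # EVM word size
POINTER_BITS = 64    # ASSUMED maximum pointer width of the host IL

SPACE_BASE = {
    "code": 0,
    "memory": 1 * 2**WORD_BITS,   # the 2**256 offset mentioned above
    "storage": 2 * 2**WORD_BITS,  # assumed: next free base after memory
}


def flat_address(space, addr):
    """Map an (address space, address) pair to one flat address."""
    if not 0 <= addr < 2**WORD_BITS:
        raise ValueError("address out of range for a 256-bit word")
    return SPACE_BASE[space] + addr


def fits_in_pointer(flat):
    """True if the flat address is representable in POINTER_BITS bits."""
    return 0 <= flat < 2**POINTER_BITS


code_addr = flat_address("code", 0x40)
mem_addr = flat_address("memory", 0x40)
print(fits_in_pointer(code_addr), fits_in_pointer(mem_addr))
```

Under these assumptions only the code space stays addressable; every memory or storage address overflows the pointer width.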

However, the current state is still useful. I created a small plugin command that uses Binary Ninja's partial emulation to get possible stack contents for interesting instructions such as SSTORE. It looks like this:

[screenshot: 2018-03-25-212945_728x144_scrot]

Here the address can be inferred, but the value depends on input/computation and is therefore unknown.
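The underlying idea can be sketched without Binary Ninja at all. The following is a minimal standalone abstract interpreter, not the plugin code from this PR: it handles only a handful of opcodes, assumes straight-line bytecode (no jumps), and uses `None` for values that depend on input or computation. The function name `sstore_operands` is made up for this example.

```python
# Opcode values as defined by the EVM instruction set.
ADD, CALLDATALOAD, POP, SSTORE = 0x01, 0x35, 0x50, 0x55
PUSH1, PUSH32 = 0x60, 0x7F


def sstore_operands(code):
    """Return (pc, key, value) for each SSTORE; None means 'unknown'."""
    stack, found, pc = [], [], 0

    def pop():
        # An empty stack is treated as an unknown value.
        return stack.pop() if stack else None

    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1  # number of immediate bytes
            stack.append(int.from_bytes(code[pc + 1:pc + 1 + n], "big"))
            pc += n
        elif op == CALLDATALOAD:
            pop()
            stack.append(None)  # depends on transaction input
        elif op == ADD:
            a, b = pop(), pop()
            known = a is not None and b is not None
            stack.append((a + b) % 2**256 if known else None)
        elif op == POP:
            pop()
        elif op == SSTORE:
            key, value = pop(), pop()  # key is on top of the stack
            found.append((pc, key, value))
        else:
            raise NotImplementedError(hex(op))
        pc += 1
    return found


# PUSH1 0x00; CALLDATALOAD; PUSH1 0x05; SSTORE
result = sstore_operands(bytes.fromhex("600035600555"))
print(result)
```

For this input the storage key is the constant 5, while the stored value comes from calldata and stays unknown, which matches the situation described above.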

This PR is not ready to merge; it is primarily for getting feedback. I moved quite a bit of code around, so I'd appreciate comments on that in particular. Also, there seems to be a bug with LLIL emulation in the current Binary Ninja dev build (see this issue), so I wouldn't merge until that is resolved.

joshwatson commented 6 years ago

We’ve been refactoring it with the lifter in a different branch, as well as refactoring the disassembler, so we’ll hold off with this until we merge those changes into master. Some of this is likely duplication of effort, but we’ll incorporate anything else that’s good 🙂

f0rki commented 6 years ago

Is the branch with the refactoring public?

joshwatson commented 6 years ago

Yes it’s the refactor branch

f0rki commented 6 years ago

Ah, I didn't notice that branch. That's quite a bit of refactoring. However, as far as I can see, quite a few EVM instructions are still missing in the refactor branch, right? I'm sure you could use that part from my PR.

Some things I'm not sure are useful:

dguido commented 6 years ago

Hey @f0rki, I'm not sure if you're still working on this, but if you are, please sign our CLA. Let us know if you need any assistance; we're all on Slack and excited to continue working on Ethersplay.

Click through this link to sign the CLA: https://cla-assistant.io/trailofbits/ethersplay?pullRequest=25

withzombies commented 6 years ago

Unfortunately, properly modeling EVM is not possible with Binary Ninja IL. Theo, Josh, and I have all been down this route and come up short. Additionally, I've discussed this pretty extensively with the Vector35 guys and it's not going to happen. I'd rather not merge in incomplete support, so I'm going to close this PR.

If it becomes possible, I think we will definitely take this approach.

f0rki commented 5 years ago

@dguido not so much anymore. I'll sign the CLA when something might actually merge.

@withzombies That is unfortunate. I found ethersplay to be pretty helpful for manually reversing contracts, especially after I modeled most of the instructions. It was super helpful for recovering constant SLOAD or MLOAD addresses. So I might disagree a bit, in that partial modeling/lifting is also useful :)

I'm guessing the main issue is that the EVM uses separate address spaces for everything? Could you explain a bit more why it isn't possible? I'd certainly be interested in what the main issues are.