channgo2203 / fevm

Formalization of EVM in Coq
5 stars 1 forks source link

Reasoning about reentrancy sceinarios #1

Open pirapira opened 8 years ago

pirapira commented 8 years ago

This is an enhancement request for dealing with the following sceinario. Sometimes when a program running for an account A CALLs another account, the other account might call back A. In this case, there are multiple EVM instances running at the same time with independent stacks and independent memory contents, but shared storage.

As a result, the original instance might observe some change in the storage after a successful CALL. I think there are many ways to formalize this phenomenon: maybe by using a relation of states (pre, post) that captures possible state transitions before/after executing the program. Or, maybe by explicitly model the nested multiple EVM instances.

I think the choice affects the design how to model the states.

channgo2203 commented 8 years ago

Yes, it is exactly what I was thinking about. I intend to do as: for each call instruction, we create a new instance of EVMachine record (in evm.v). EVMachine consists of its own stack, memory, storage, and external running environment.

I remember I asked 1 question on ethereum stackexchange about this situation. Since in the yellow paper, it specifies that account A calls account B and B's code is executed. A must wait until B's code terminate normally or there was an exception. The issue raises when B calls back to A (the same function that has called B's code). It seems that we have dead-lock here.

I was thinking how we can verify formally this situation (dead-lock) also.

pirapira commented 8 years ago

In case of callbacks, common implementations don't have dead-locks but multiple EVM instances on the same account. The situation is called often called reentrancy. There might be a discrepancy between the yellow paper and these implementations.

channgo2203 commented 8 years ago

Thanks for pointing that. I'll look at the notion "reentrancy". However, consider again the case: Let's say contract A has function fa, contract B has function fb. First fa calls fb then fb calls fa back. Thus we have the following chain (---> means function call, instance means EVM instance): (instance 1 of A) ---> (instance 1 of B) ---> (instance 2 of A) ---> (instance 2 of B) .... It seems to be non-termination (in other words, dead-lock). Maybe I don't understand correctly.

pirapira commented 8 years ago

Now I see what you meant. I remember there is a notion of the call-stack. The call-stack would contain [... B, A, B, A] in your example. Its depth is capped at 1024.

channgo2203 commented 8 years ago

I see, thanks.

The implementation is almost done for basic instructions. It can run a subset of programs, e.g. the following one https://github.com/ethereum/wiki/wiki/Ethereum-Development-Tutorial

The output strings for debugging look like: Initial EVM = "(Gas available: 0000000000000000000000000000000000000000000000000000000000000000,PC: 0000000000000000000000000000000000000000000000000000000000000000,Memory: ,Active Memory: 0000000000000000000000000000000000000000000000000000000000000000,Stack: ,External Running Environment: (Ia: 0000000000000000000000000000000000000000,Io: 0000000000000000000000000000000000000000,Is: 0000000000000000000000000000000000000000,Ip: 0000000000000000000000000000000000000000000000000000000000000000,Id: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 69 D6 24,Iv: 0000000000000000000000000000000000000000000000000000000000000000,Ib: 60 00 35 54 19 60 09 57 00 5B 60 20 35 60 00 35 55 ,Ih: 0000000000000000000000000000000000000000000000000000000000000000,As: ,Al: ,Ar: 0000000000000000000000000000000000000000000000000000000000000000,Ie: 0000000000000000000000000000000000000000000000000000000000000000),Storage: )"%string

: string

Bytecode = "PUSH1 00 CALLDATALOAD SLOAD NOT PUSH1 09 JUMPI STOP JUMPDEST PUSH1 20 CALLDATALOAD PUSH1 00 CALLDATALOAD SSTORE "%string

: string

Final EVM = "(Gas available: 0000000000000000000000000000000000000000000000000000000000000000,PC: 0000000000000000000000000000000000000000000000000000000000000011,Memory: ,Active Memory: 0000000000000000000000000000000000000000000000000000000000000000,Stack: ,External Running Environment: (Ia: 0000000000000000000000000000000000000000,Io: 0000000000000000000000000000000000000000,Is: 0000000000000000000000000000000000000000,Ip: 0000000000000000000000000000000000000000000000000000000000000000,Id: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 69 D6 24,Iv: 0000000000000000000000000000000000000000000000000000000000000000,Ib: 60 00 35 54 19 60 09 57 00 5B 60 20 35 60 00 35 55 ,Ih: 0000000000000000000000000000000000000000000000000000000000000000,As: ,Al: ,Ar: 0000000000000000000000000000000000000000000000000000000000000000,Ie: 0000000000000000000000000000000000000000000000000000000000000000),Storage: 0000000000000000000000000000000000000000000000000000000000000036:=000000000000000000000000000000000000000000000000000000007869D624, )"%string

pirapira commented 8 years ago

Ah nice. What you can do is to extract the code into OCaml and start building a Coq-OCaml Ethereum client, on which you can write proofs.

channgo2203 commented 7 years ago

Hi Yoichi,

Sorry for my late response. I have a question related to formal verification jobs at Ethereum Foundation. I am plan to develop a model checker for EVM byte-code with automated solver (SMT solver Z3). Is Ethereum Foundation interested in this direction? I am looking for a job in formal methods after my postdoc at CMU.

Thank you!

pirapira commented 7 years ago

I sent you a message in gitter.im.

vietlq commented 6 years ago

I recommend looking at https://www.zenprotocol.com