Custom view shows another contract and uses the wrong PC in the kcfg viewer

iFrostizz commented 1 year ago

Issue

When launching the kcfg-viewer, the custom view is not very useful because it rarely shows the correct executed part of the Solidity source. There are two reasons of that:

Only the testing contract is passed to the source map (1)
Source maps are indexed by instruction index, but are fetched by PC (2)

(1)

When calling the exec_foundry_view_kcfg function

https://github.com/runtimeverification/evm-semantics/blob/master/kevm-pyk/src/kevm_pyk/__main__.py#L630

We get the name of the testing contract

https://github.com/runtimeverification/evm-semantics/blob/master/kevm-pyk/src/kevm_pyk/__main__.py#L633

And only this one is passed to the source mapping structure

https://github.com/runtimeverification/evm-semantics/blob/master/kevm-pyk/src/kevm_pyk/__main__.py#L641-L642

Then if any external contract is called, we will not see the executed part in the source code

(2)

The instr_to_pc is correctly implemented but not used here

https://github.com/runtimeverification/evm-semantics/blob/master/kevm-pyk/src/kevm_pyk/solc_to_k.py#L296

Here

https://github.com/runtimeverification/evm-semantics/blob/master/kevm-pyk/src/kevm_pyk/solc_to_k.py#L324

We should probably rewrite it as:

_srcmap[instr_to_pc[i]] = (s, l, f, j, m)

Ideas

There is no way to know which contract is executed from the PC information only, because it may overlap (it starts at PC = 0 when executing a new contract).

Then when calling back _custom_view(), we may want to pass the contract name.

So a solution would be to allow writing the name of the contract that is currently executed in the K cell although not very desirable probably.

A better solution would be to write the bytecode of the contract that is currently executed in the node and we could fetch the contract, but if two contracts have the same bytecode then it may display the wrong one and may confuse the user.

palinatolmach commented 1 year ago

@RaoulSchaffranek do you have any source map-related information for this issue?

RaoulSchaffranek commented 1 year ago

In the debugger, I use the verbatim bytecode from the <program> cell to look up the contract entry in the standard JSON format. I haven't considered that two identical bytecodes could come from different source files - that's a good catch. But it is practically impossible when the IPFS hash of the contract metadata is appended to the bytecode (the default setting for solc). And even if we mix up two equivalent source files, the user can still make sense of the source maps - though I agree that showing the wrong filename is confusing behavior. In any case, I prefer that option over tracking source filenames in the semantics.

Also: Whenever we deal with source maps, we must ensure that the contract was compiled without optimizations.

palinatolmach commented 1 year ago

@iFrostizz I think we can go for Raoul's solution at the moment as it seems more practical. I'm not sure how often we encounter contracts with identical bytecode (and different contract names) within the same analysis and, as Raoul said, this solution would still provide meaningful source maps, which is probably most important (that said, I agree it's a very nice catch, and let's see if we encounter any issues like that in future).

iFrostizz commented 1 year ago

@palinatolmach I totally agree that it's better to have accurate source mappings at the price of the rare occurence of having another contract with the same bytecode appearing sometimes !

runtimeverification / kontrol

Custom view shows another contract and uses the wrong PC in the kcfg viewer #141

Issue

(1)

(2)

Ideas