streamingfast / firehose-ethereum

Ethereum on StreamingFast
Apache License 2.0
78 stars 24 forks source link

Ethereum Block Model Known Issues #71

Open maoueh opened 1 year ago

maoueh commented 1 year ago

This issue lists the currently know data issue with our Ethereum Block model.

Genesis block storage changes determinism issue

In the previous Firehose patch, we were not sorting genesis storage keys for each account leading to determinism problem on chain for which the genesis contains such definitions.

This is because the storage key/value is stored in a map[common.Hash]common.Hash map container which is unordered an may yield different results on each invocation.

We did not review each genesis config yet to see if there was such problem active today.

It's important to note that there is no problem with the data in itself, everything is there. It's just that two syncs of the genesis block might leads to different order of the StorageChanges kept in the root call.

Transaction root call BeginOrdinal is always 0

The root call BeginOrdinal value of all transactions in a block is always 0 today. This is a problem if you sort by call.BeginOrdinal, in that case relies on the fact that Calls are already sorted by execution order within a TransactionTrace.

You can "fix" the issue by doing trx.Calls[0].BeginOrdinal = trx.BeginOrdinal.

Genesis block root call missing BeginOrdinal/EndOrdinal

The root call of the genesis block didn't properly record an ordinal increment for BeginOrdinal and for EndOrdinal they will always be both 0.

There is a single call in a genesis block so this shouldn't pose any problem, the trx.Calls[0].BeginOrdinal = trx.BeginOrdinal and trx.Calls[0].EndOrdinal = max.Uint64 change can be applied to fix the issue.

Transaction#return_data is not populated

The field TransactionTrace#return_data is not populated by our patch currently.

Some Call#GasChanges are missing

While doing the Geth native tracer patch, I noticed we missed at least one place where gas change would have been meaningful. The gas refunded because of data returned to the chain is not tracked properly, but it's an important gas change, those happen at the very end of the transaction, just before the buy back.

Also, the new Geth tracer is also tracing the change from 0 -> <GasLimit> (initial balance) and the change from <LeftOver> -> 0 (buy back) which happens at the very beginning and very end of each call (buy back possibly not present if there is no gas left).

Call#input not populate on contract creation

On CREATE/CREATE2 call, we do not populate the call.input field. The contract's code is available in the repeated CodeChange code_changes field so we though we could reduce the block's size by omitting the call.input for those.

But this was a big oversight for contracts with a constructor. Indeed, in those case the constructor's parameter input is found in the call.input at the end of the EVM code, so we are not having it.

Note that for the root call CREATE operation, the input can be taken from the TransactionTrace object containing the call, this one is always properly populated. So it's a problem for smart contracts that deploys other smart contract(s).

Call#executed_code not true on some situation where it should

The Call#executed_code is instrumented https://github.com/streamingfast/go-ethereum/blob/51624bc5371590de0414116735634705fcbe2a45/core/vm/interpreter.go#L159-L165 which emits if no code was executed at all by the interpreter.

In the call start with do:

call.executed_code = call.type != CREATE && len(call.input) > 0

As the default value and then set call.executed_code = false when the account without code event is emitted by the tracer.

This is problematic for account that have code to "deal" with native transfer of value which is possible with Solidity. In those situations, we are not properly setting executed_code correctly.

The same would be true for a pre-compile address that accepts no input, but I'm unaware of such pre-compiles so I doubt this happen in practices.

TransactionTrace enforced state changes

This is a more quirk of the model than a bug in itself, but I'll list here.

The problem here is the fact that even for failed transactions there is some state changes that are still record to the chain namely: the payment of the gas and the nonce change are recorded on the chain's state even though the transaction failed.

Now the modeling problem is that we use call.state_reverted to determine if a call should be inspected but the logic does not apply to every use cases, especially not those dealing with balance changes or nonce changes.

We should maybe have moved those "fixed" changes in the TransactionTrace model.