0xPolygonZero / zk_evm

Apache License 2.0
74 stars 22 forks source link

Consider refactoring EIP 2930 access list lookups #200

Open frisitano opened 3 months ago

frisitano commented 3 months ago

Transactions can include EIP 2930 access lists containing address / storage key pairs which should be eagerly loaded into memory. In some cases these storage keys are never actually used by a transaction. As such both the kernel and the trace protocol are inefficient, and this potentially provides a DOS vector for the prover. We should consider refactoring this logic.

Quoted from @Nashtare: Yeah agree, we could try revamping this part to remove the overhead, especially as SLOADs are a bit heavy and could be seen as some DOS entry door for the prover. Would need to think more about it how sound it would be to store arbitrary dummy value in non-accessed storage slots.

4l0n50 commented 3 months ago

If the addresses/storage keys are never accessed you still need to charge gas for them? At least EIP 2039 description says:

we charge additional gas for the access list: ACCESS_LIST_ADDRESS_COST gas per address and ACCESS_LIST_STORAGE_KEY_COST gas per storage key.

That would mean that you always need to iterate over the whole access list. On the other hand, gas cost would impose a limit on the size of the list.

As such both the kernel and the trace protocol are inefficient

At least the kernel only calls %insert_accessed_addresses_no_return (other than iterating over the list which is necessary for computing the gas), which takes a constant number of cycles (I don't remember, but it was something like 40 cycles)

Nashtare commented 3 months ago

That would mean that you always need to iterate over the whole access list.

We always go through it anyway to parse each addresses / storage keys.

On the other hand, gas cost would impose a limit on the size of the list.

Yeah, but I think it may still be prohibitive (30M gas limit allows for something like ~16k storage keys). If they are badly spread, the overhead would be quite large.

At least the kernel only calls %insert_accessed_addresses_no_return (other than iterating over the list which is necessary for computing the gas), which takes a constant number of cycles

When parsing the txn RLP and decoding the access list, we then call %insert_accessed_storage_keys_with_original_value for each storage key of the decoded address, which induces an SLOAD.

4l0n50 commented 2 months ago

we then call %insert_accessed_storage_keys_with_original_value for each storage key of the decoded address, which induces an SLOAD.

Ahh, I see. I was missing that part. I suppose the easiest solution would be to wait until we solve #172.