Closed maoueh closed 1 year ago
There is still a fair amount of cloning happening when doing the trace.clone()
.
I think we need to re-think our substreams-<chain>
library so that main methods are more "destructive" on the block using most of the Rust move semantics.
For example if all_transaction_traces
was fn all_transaction_traces(block: Block): impl IntoIterator<TransactionTrace>
this would lead to code like:
#[substreams::handlers::map]
fn map_action_traces(block: Block) -> Result<ActionTraces, Error> {
let mut action_traces = vec![];
for trx in block.all_transaction_traces() {
for trace in trx.action_traces {
action_traces.push(trace);
}
}
Ok(ActionTraces { action_traces })
}
This is a much more efficient version than before as no cloning happen and memory structure are simply moved around. I'll perform soon WASM benchmarks around what's the impact of excessive cloning on big block, maybe it will proved negligible (I personally think it has a good impact).
I see, yeah let's move it.
This is what you mean @maoueh #8?
Here some results of excessive clone()
. Here you will see a baseline with no clones and one where blk.clone()
as well as each trx_trace.clone()
for an good sized Ethereum Mainnet block:
│ /tmp/run_baseline.txt │ /tmp/run_clone_block_and_each_trx_trace.txt │
│ sec/op │ sec/op vs base │
Execution/vm=wazero,instance=reused,tag=map_block-10 6.674m ± 1% 9.521m ± 0% +42.66% (p=0.000 n=10)
So a 43% decrease in performance.
Cool. Good to know.
Doing just the trx_trace.clone()
was decreasing by ~24%
The main example show cases:
When the
block.clone()
will perform a deep clone of the full block in memory which is highly undesirable and for all of the transaction trace. Where later each action trace is itself cloned again.Pure view function should be used or
into_iter()
can be used to "move" transaction and action out of the array moving them instead of copying of referencing.I did not had the chance to review the Rust code, but I'll happily guide/help anyone who is doing it.