TrueBlocks / trueblocks-core

The main repository for the TrueBlocks system
https://trueblocks.io
GNU General Public License v3.0
1.04k stars 194 forks

chifra scrape - performance #3241

Open tjayrush opened 9 months ago

tjayrush commented 9 months ago

This issue is delayed. We will get to it one day.

Summary: scraping is not much slower than it used to be even though we query more data.

The new scraper is a little slower than the old one, but not by as much as I feared, especially considering that we're doing more querying than before (for withdrawals, for example).

Reth may be faster than Erigon, but the machine I was on was quite a bit faster, so that may not be a fair comparison.

Reth does not support trace_filter. On Erigon, using trace_filter against a 50-block range appears to be about five times faster than 50 individual trace_block queries, one per block (which makes sense). The effect is more pronounced when going over the wire.
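To make the comparison concrete, here is a minimal sketch of the two request shapes: one trace_filter call covering the whole range versus one trace_block call per block. It only builds the JSON-RPC payloads and sends nothing. The helper names and the block numbers are hypothetical and are not taken from the scraper's code.

```python
import json


def trace_filter_request(first_block: int, last_block: int, request_id: int = 1) -> dict:
    """Build a single trace_filter request covering an inclusive block range."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "trace_filter",
        # Block numbers are passed as hex strings.
        "params": [{"fromBlock": hex(first_block), "toBlock": hex(last_block)}],
    }


def trace_block_requests(first_block: int, last_block: int) -> list:
    """Build one trace_block request per block in the same inclusive range."""
    return [
        {"jsonrpc": "2.0", "id": i, "method": "trace_block", "params": [hex(block)]}
        for i, block in enumerate(range(first_block, last_block + 1), start=1)
    ]


if __name__ == "__main__":
    # Hypothetical 50-block range, chosen only for illustration.
    ranged = trace_filter_request(17_000_000, 17_000_049)
    singles = trace_block_requests(17_000_000, 17_000_049)
    print(json.dumps(ranged))
    print(len(singles), "individual trace_block requests for the same range")
```

The ranged form is one round trip instead of fifty, which would fit with the gap widening over the wire.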

Using trace_filter with a block range (the data gets very big, so experiment with the optimal range size) seems to be a more productive use of our time than using eth_getLogs with a range. If we had to choose one, I'd choose trace_filter. But trace_filter and eth_getLogs may take the same range, in which case it may be a two-for-one deal.
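One way to experiment with the range size is to split the scrape into fixed-size windows and build a trace_filter and an eth_getLogs request for each window from the same bounds. The sketch below is an assumption about how that could look, not the scraper's implementation. `block_windows`, `paired_requests`, and the window size are hypothetical, and whether the node accepts the two requests as one JSON-RPC batch is also assumed.

```python
from typing import Iterator, List, Tuple


def block_windows(first_block: int, last_block: int, window_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) windows covering first_block..last_block."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    start = first_block
    while start <= last_block:
        end = min(start + window_size - 1, last_block)
        yield start, end
        start = end + 1


def paired_requests(first_block: int, last_block: int, window_size: int) -> List[list]:
    """For each window, build a two-element batch: trace_filter plus eth_getLogs."""
    batches = []
    for start, end in block_windows(first_block, last_block, window_size):
        bounds = {"fromBlock": hex(start), "toBlock": hex(end)}
        batches.append([
            {"jsonrpc": "2.0", "id": 1, "method": "trace_filter", "params": [dict(bounds)]},
            {"jsonrpc": "2.0", "id": 2, "method": "eth_getLogs", "params": [dict(bounds)]},
        ])
    return batches


if __name__ == "__main__":
    # Hypothetical range and window size; tune window_size against response size.
    for batch in paired_requests(17_000_000, 17_000_119, 50):
        print([(req["method"], req["params"][0]) for req in batch])
```

Timing the same block range with a few different `window_size` values would show where the response size starts to outweigh the saved round trips.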

About 37.2% of the time during scraping is not RPC-query related. 54.7% is spent querying traces, 6.6% querying withdrawals, and about 1.5% querying logs, which argues, again, for optimizing traces.
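These shares can be reproduced from the timings in the table by comparing each run that has one query type disabled against the full `fix` run. The sketch below assumes the costs are additive and that whatever is left over is the non-RPC share. The function and dictionary names are hypothetical. The results agree with the percentages quoted above to within rounding.

```python
# Timings copied from the table (wildmolasses rpc, laptop client, reth node).
FULL_RUN = 16510  # branch "fix": all queries enabled
ABLATED_RUNS = {
    "traces": 7470,        # branch "fix-notrace"
    "withdrawals": 15420,  # branch "fix-nowith"
    "logs": 16260,         # branch "fix-nologs"
}


def time_shares(full_run: float, ablated_runs: dict) -> dict:
    """Return the percentage of full_run attributable to each disabled query type.

    Assumes additivity: the time saved by disabling a query type is its cost.
    The leftover is reported under the key "non-rpc".
    """
    shares = {
        name: 100.0 * (full_run - ablated) / full_run
        for name, ablated in ablated_runs.items()
    }
    shares["non-rpc"] = 100.0 - sum(shares.values())
    return shares


if __name__ == "__main__":
    for name, share in time_shares(FULL_RUN, ABLATED_RUNS).items():
        print(f"{name:12s} {share:5.2f}%")
```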

rpc          ,client       ,branch      ,node      ,time
----------   ,----------   ,---------   ,--------- ,---------
laptop       ,laptop       ,fix         ,reth      ,116.04
laptop       ,laptop       ,master      ,reth      ,118.31
laptop       ,wildmolasses ,fix         ,erigon    ,xxx
laptop       ,wildmolasses ,master      ,erigon    ,xxx

rpc          ,client       ,branch      ,node      ,time      ,scaled ,share
----------   ,----------   ,---------   ,--------- ,--------- ,------ ,-----
wildmolasses ,laptop       ,frame       ,reth      ,          ,       ,37.2%
wildmolasses ,laptop       ,fix-notrace ,reth      ,7470      , 45245 ,54.7%
wildmolasses ,laptop       ,fix-nowith  ,reth      ,15420     , 93398 , 6.6%
wildmolasses ,laptop       ,fix-nologs  ,reth      ,16260     , 98486 , 1.5%
wildmolasses ,laptop       ,fix         ,reth      ,16510     ,100000 ,     

wildmolasses ,laptop       ,master      ,reth      ,358.02
wildmolasses ,wildmolasses ,fix         ,erigon    ,150.28
wildmolasses ,wildmolasses ,master      ,erigon    ,144.93

tjayrush commented 9 months ago

I removed the block query for withdrawals from pre-Shanghai blocks, so that will help. The scraper is "not so bad" even given that it's querying more (withdrawals and receipts).