paradigmxyz / cryo

cryo is the easiest way to extract blockchain data to parquet, csv, json, or python dataframes
Apache License 2.0

Utilize / POC reth database directly instead of reth RPC (as an alternative datasource) #156

Open liangjh opened 8 months ago

liangjh commented 8 months ago

Is your feature request related to a problem? Please describe.

Could we improve performance with a cryo extension (or option) that reads the underlying reth node database directly when syncing data, instead of going through RPC?

Describe the solution you'd like

As a user, I'd like an option in the CLI (or programmatically) to choose between the reth RPC and a local reth node database. I understand this won't work for remote nodes or non-reth implementations (geth etc.), but it could be a faster option for those who do run a local reth node. Perhaps a simple POC that reads from a local reth node database could be a first step; we can then decide whether this is a good solution to pursue and build upon. It could be exposed as an experimental option, with RPC remaining the default behavior.
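To make the request concrete, here is a minimal sketch of how the source selection could look from the programmatic side. Everything in it is hypothetical: `DataSource`, `SourceConfig`, `resolve_source`, and the `reth_db_path` parameter are illustrative names and not part of cryo's existing API. It only shows the intended behavior: RPC remains the default, and the local database is used only when explicitly requested.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DataSource(Enum):
    """Hypothetical set of backends; only RPC exists in cryo today."""
    RPC = "rpc"          # current behavior: JSON-RPC against any node
    RETH_DB = "reth-db"  # proposed (experimental): read a local reth database


@dataclass(frozen=True)
class SourceConfig:
    source: DataSource
    location: str  # RPC URL, or path to the local reth data directory


def resolve_source(
    rpc_url: Optional[str] = None,
    reth_db_path: Optional[str] = None,
) -> SourceConfig:
    """Pick a backend: the local database is opt-in, RPC is the default."""
    if reth_db_path is not None:
        # Explicit opt-in to the experimental local-database path.
        return SourceConfig(DataSource.RETH_DB, reth_db_path)
    if rpc_url is None:
        raise ValueError("either rpc_url or reth_db_path must be provided")
    return SourceConfig(DataSource.RPC, rpc_url)
```

With this shape, existing callers that pass only an RPC URL would see no change, and the database path stays an opt-in that can be dropped if the POC does not pan out.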

Describe alternatives you've considered

I believe this is itself the alternative approach to the current RPC-centric solution.

Additional context

The reth-indexer takes this approach of reading the local reth database. @gakonst has commented in the past on reading directly from the reth database as a speed improvement.