alloy-rs / alloy

Transports, Middleware, and Networks for the Alloy project
https://alloy.rs
Apache License 2.0

[Feature] Add a `CachedReqwestProvider` to cache RPC requests using a `ReqwestProvider` #770

Open puma314 opened 2 months ago

puma314 commented 2 months ago

Component

provider, pubsub

Describe the feature you would like

For use-cases like SP1-Reth or Kona, we often want to execute a (historical) block, but we don't have the entire state in memory, so we execute the block with a ProviderDb that fetches accounts, storage, etc. using an RPC. Fetching from the network is slow and often takes minutes for all of the accesses required for an entire block.

Often we re-run these blocks to debug things or tune performance, and each time the feedback loop is very slow because it requires waiting for all the network requests again. It would be nice to add a very simple caching layer on top of ReqwestProvider that can cache the results of RPC calls to a file (or some other easy-to-set-up format) and check the cache before sending a network request.

This would speed up iteration time for use-cases like Kona and SP1-Reth tremendously.

An interface like this might make sense:

let provider = ReqwestProvider::new_http(rpc_url).cache("my_file.txt")

In our case, we are usually querying old blocks (not near the tip of the chain), so re-org awareness is not important for our use-case. We just want a really simple caching layer.

Additional context

No response

gakonst commented 2 months ago

Could this be a tower layer?

Seeing https://docs.rs/tower/latest/tower/ready_cache/cache/struct.ReadyCache.html - cc @mattsse does this work?
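
For readers unfamiliar with tower: a Layer is a middleware factory that wraps one Service in another. A minimal sketch of what a transport-level caching layer could look like, purely for illustration (these types are hypothetical, not existing alloy APIs, and the thread below moves away from this placement):

use tower::Layer;

/// Hypothetical caching layer applied at the transport level.
struct CacheLayer;

/// Hypothetical service wrapping the inner transport; a real implementation
/// would also implement tower::Service and hold the cached responses.
struct CacheService<S> {
    inner: S,
}

impl<S> Layer<S> for CacheLayer {
    type Service = CacheService<S>;

    fn layer(&self, inner: S) -> Self::Service {
        CacheService { inner }
    }
}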

puma314 commented 2 months ago

I'm not fully sure I understand how tower works, but noting that we'd want to save stuff to a file so it's persisted across instantiations (and not just keep the cache in memory, for example).

prestwich commented 2 months ago

We won't add caching at the Transport layer via tower, because caching (unlike rate limiting or retrying) needs to be aware of RPC semantics and potentially the provider heartbeat task, so that it can invalidate caches on new blocks and reorgs. This means we need it to be a provider-level alloy_provider::Layer producing CachingProvider<P, T, N>, rather than a tower::Layer producing CachingTransport<T>.

This is blocked by #736 (which is pretty straightforward to resolve)

Is the use case here making a high volume of requests against specific deep historical states? It sounds like you actually don't want to cache to a file. You want an in-memory cache that is persisted to a file when your program stops? I'm in general not in favor of caching to/from a file directly, as responses get invalidated so regularly, fs access degrades perf, and the target user for alloy doesn't have an archive node and doesn't make queries against the deep state. Would it be enough to have the cache internals be (de)serializable and a way to instantiate the cache with data in it?
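
One way to read "(de)serializable cache internals" is sketched below, illustrative only and assuming a serde/serde_json-based format; none of these names are existing alloy types:

use std::collections::HashMap;

use serde::{Deserialize, Serialize};

/// Hypothetical cache internals: raw request (method + params) mapped to the
/// raw JSON response, serializable so callers can persist or pre-populate it.
#[derive(Default, Serialize, Deserialize)]
struct CacheData {
    entries: HashMap<String, serde_json::Value>,
}

impl CacheData {
    /// Instantiate the cache pre-populated with previously saved data.
    fn with_entries(entries: HashMap<String, serde_json::Value>) -> Self {
        Self { entries }
    }
}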

gakonst commented 2 months ago

This means we need it to be a provider-level alloy_provider::Layer producing CachingProvider<P, T, N>, rather than a tower::Layer producing CachingTransport.

Good point, supportive.

It sounds like you actually don't want to cache to a file. You want an in-memory cache that is persisted to a file when your program stops?

@puma314 basically this means:

  1. on the first run you start with no cache file on disk
  2. the first request goes to the RPC and gets cached
  3. the second request is served from the cache
  4. when you Ctrl+C, the cache's Drop impl gets called, persisting everything to disk (sketched below)
  5. when you start the process again, either the entire file is loaded into memory OR the data is loaded "just in time" from the file; either would work I think
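
A rough sketch of step 4, with hypothetical names and assuming a serde_json-backed format (note the fallibility caveat raised in the next comment):

use std::{collections::HashMap, fs, path::PathBuf};

/// Hypothetical in-memory cache that persists itself when dropped.
struct MemoryCache {
    path: PathBuf,
    entries: HashMap<String, serde_json::Value>,
}

impl Drop for MemoryCache {
    fn drop(&mut self) {
        // Best-effort persistence: serialization and fs ops are fallible here,
        // and errors can only be ignored or logged, not propagated.
        if let Ok(bytes) = serde_json::to_vec(&self.entries) {
            let _ = fs::write(&self.path, bytes);
        }
    }
}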

puma314 commented 2 months ago

Yup, that sounds great. @prestwich our use-case is that we are querying getProof and getStorage on blocks that are potentially hours or more in the past (so blocks that are well past the reorg window). We are using this to generate a ZKP, and we wouldn't want to generate a ZKP of a block that could be re-orged, if that makes sense.

@gakonst's proposal looks great to me as a potential devex.

prestwich commented 2 months ago

when you ctrl +c the cache's drop impl gets called, persisting everything to disk

Serialization and fs ops are fallible and can't be reliably used in a Drop, so I wouldn't recommend this approach.

More broadly tho, a file system-backed cache of finalized responses is not broadly applicable and requires us to make decisions about the user's fs. I am not in favor of including it in the main alloy crates. A memory cache that can be loaded from fs at runtime and serialized to fs on demand is applicable to a lot of users, and could be in the main provider crate. Would that fit your need?

Assuming you're running your own infra, the need may also be better served by accessing reth db or staticfiles directly? If running alongside reth, retrieving proofs and then storing them to the file system is duplicating data that's already in the file system, no?

gakonst commented 2 months ago

Serialization and fs ops are fallible and can't be reliably used in a Drop, so I wouldn't recommend this approach.

More broadly tho, a file system-backed cache of finalized responses is not broadly applicable and requires us to make decisions about the user's fs. I am not in favor of including it in the main alloy crates.

I've used this method before multiple times for debugging (e.g in MEV Inspect) and it's generally been fine, so I personally don't worry about the fallibility, but OK with doing this as a separate crate.

A memory cache that can be loaded from fs at runtime and serialized to fs on demand is applicable to a lot of users, and could be in the main provider crate. Would that fit your need?

How should the cache be populated in this case? Still via ProviderLayer, where each method populates an LRU of the data on a cache miss? And is it the responsibility of the user to flush the cache to disk?
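
For reference, "populates the cache on a miss" per method boils down to something like this generic sketch (names are illustrative, not an alloy API):

use std::collections::HashMap;

/// Generic cache-on-miss helper: return the cached value if present,
/// otherwise fetch it, store it, and return it.
fn get_or_fetch<F>(cache: &mut HashMap<String, String>, key: &str, fetch: F) -> String
where
    F: FnOnce() -> String,
{
    if let Some(hit) = cache.get(key) {
        return hit.clone(); // cache hit: no network request
    }
    let value = fetch(); // cache miss: forward to the RPC backend
    cache.insert(key.to_string(), value.clone());
    value
}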

Assuming you're running your own infra, the need may also be better served by accessing reth db or staticfiles directly? If running alongside reth, retrieving proofs and then storing them to the file system is duplicating data that's already in the file system, no?

Proofs aren't part of the Reth DB; they get generated on the fly, so I don't think this would work.

puma314 commented 2 months ago

A memory cache that can be loaded from fs and saved to fs would work for me. I'm not running my own infra in this case--the point is that for basically any chain we can get all the storage slots & proofs for running a block in a zkVM, without the need to have a local node running that is synced for that chain. It's a lot lower friction if we can just plug in an RPC vs. having to sync a reth instance. (Also I'm not sure if reth has getProof implemented yet).

let mut cache = MemoryCache::load("file.txt");
let provider = ReqwestProvider::new_http(...).with_cache(cache);
// do stuff with provider
cache.save("file.txt");

seems totally fine to me.

gakonst commented 2 months ago

SG re: the API above! Confirming: if you do stuff with the provider that hits the actual backend and not the cache, the new file.txt should 1) include all the requests which were not cached before, and 2) include all the previous contents of the cache?

eth_getProof is implemented in Reth, but not the historical variant for arbitrary lookback due to limitations of the Erigon DB design which we inherit.

prestwich commented 2 months ago

I've used this method before multiple times for debugging (e.g in MEV Inspect) and it's generally been fine, so I personally don't worry about the fallibility, but OK with doing this as a separate crate.

Panics in drops cause aborts, so you can do it, but it's not a decision we want to make on behalf of all users, as we don't know what conditions they're running in

A memory cache that can be loaded from fs and saved to fs would work for me. I'm not running my own infra in this case--the point is that for basically any chain we can get all the storage slots & proofs for running a block in a zkVM, without the need to have a local node running that is synced for that chain. It's a lot lower friction if we can just plug in an RPC vs. having to sync a reth instance. (Also I'm not sure if reth has getProof implemented yet).

let mut cache = MemoryCache::load("file.txt");
let provider = ReqwestProvider::new_http(...).with_cache(cache);
// do stuff with provider
cache.save("file.txt");

seems totally fine to me.

instantiation should run through the builder API, so the sketch here is something like:

/// Cache object
struct Cache { ... }

/// Caching configuration object
struct CachingLayer {
    cache: Option<Cache>,
    // other fields?
}

/// Provider with cache
struct CachingProvider<P, N, T> { inner: P, cache: Cache }

let provider = builder.layer(CachingLayer::from_file("file.txt")?).http(url);
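
A possible body for CachingLayer::from_file in the sketch above, assuming Cache derives serde's Deserialize and is stored as JSON (illustrative only; error handling simplified):

impl CachingLayer {
    fn from_file(path: &str) -> std::io::Result<Self> {
        let cache = match std::fs::read(path) {
            // a previously saved cache exists: deserialize it
            Ok(bytes) => Some(serde_json::from_slice(&bytes)?),
            // no file yet (first run): start with an empty cache
            Err(e) if e.kind() == std::io::ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        Ok(Self { cache })
    }
}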

do you have a ballpark for number of proofs/etc you intend to cache?

puma314 commented 1 month ago

I think we would need low 100s of proofs per block, since it covers all the accounts/state touched during the block.

prestwich commented 1 month ago

So I think actionable steps for implementing this are: