hashgraph / hedera-json-rpc-relay

Implementation of Ethereum JSON-RPC APIs for Hedera
Apache License 2.0

Configure cache control headers with reasonable cache-ttls #714

Open mshakeg opened 1 year ago

mshakeg commented 1 year ago

Problem

No cache-control response headers are set for methods where they would make sense, for example a method that always returns the same data, such as eth_chainId, or one whose result changes only infrequently, such as web3_clientVersion. However, because the id request nonce in the HTTP request body is incremented on every new request, the CDN would need to look only at the method name in the request and return a cached response carrying the same id as the request. This may not be possible on any major CDN, as they do not allow constructing the cache key from the request body or modifying the cached response.
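To make the idea concrete, here is a minimal sketch of a cache layer that ignores the id when building the cache key and writes the caller's id back into the cached response. All names here (JsonRpcRequest, cacheKey, withRequestId, store, lookup) are hypothetical and not part of the relay or of any CDN API. The sketch also keys on params in addition to the method name, on the assumption that methods taking arguments need that to avoid returning the wrong result.

```typescript
// Hypothetical types and helpers for illustration only; not relay code.
interface JsonRpcRequest {
  jsonrpc: string;
  id: number | string | null;
  method: string;
  params?: unknown[];
}

interface JsonRpcResponse {
  jsonrpc: string;
  id: number | string | null;
  result?: unknown;
  error?: unknown;
}

// Build the cache key from method and params only, so two requests that
// differ only in `id` map to the same cache entry.
function cacheKey(req: JsonRpcRequest): string {
  return JSON.stringify([req.method, req.params ?? []]);
}

// Return a copy of a cached response that carries the caller's own `id`.
function withRequestId(cached: JsonRpcResponse, req: JsonRpcRequest): JsonRpcResponse {
  return { ...cached, id: req.id };
}

// In-memory stand-in for the CDN cache (no TTL handling in this sketch).
const cache: Record<string, JsonRpcResponse> = {};

// Store successful responses only; error responses are never cached.
function store(req: JsonRpcRequest, res: JsonRpcResponse): void {
  if (res.error === undefined) {
    cache[cacheKey(req)] = res;
  }
}

// Look up a cached response and re-stamp it with the request's `id`.
function lookup(req: JsonRpcRequest): JsonRpcResponse | undefined {
  const key = cacheKey(req);
  if (!Object.prototype.hasOwnProperty.call(cache, key)) {
    return undefined;
  }
  return withRequestId(cache[key], req);
}
```

With this shape, a second eth_chainId request with a different id would be answered from the cache, and the response would still echo that request's id.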

The primary motivation for this issue is to minimise hitting rate & request limits, as the ethers provider seems to make an unnecessarily large number of eth_chainId calls, even when the chainId is passed on provider instantiation.

According to this reply, ethers@5.0.3 exposes StaticJsonRpcProvider, which would make at most a single eth_chainId request and reuse the result statically, in which case this issue might not be needed.

Solution

Alternatives

No response

mshakeg commented 1 year ago

It would also be useful to cache, on CDNs, the results of the many JSON RPC method requests that return static (unchanging) results, such as the following calls:

This would be especially helpful for calls made by the graph node, such as eth_getBlockByNumber and eth_getTransactionByHash. Say you have multiple graph nodes located in the same cloud region, polling for the same blocks and transactions and served by the same CDN: you'd effectively have only 1 call (i.e. the very first) go through to the relay service, with all subsequent calls made by other graph nodes being returned from the CDN cache.
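As a rough illustration of what "reasonable cache-ttls" could look like, the sketch below picks a Cache-Control header value per method. The function name cacheControlFor, the policy table, and every TTL value are assumptions made for this example, not relay behaviour. It also assumes that a block requested by a concrete number and a transaction that has been found never change afterwards, while tags such as "latest" and null results must not be cached.

```typescript
// Hypothetical policy table; the TTL values are illustrative assumptions.
const STATIC_METHOD_TTL_SECONDS: Record<string, number> = {
  eth_chainId: 86400,
  web3_clientVersion: 300,
};

// Matches a concrete hex block number such as "0x1b4", as opposed to a tag like "latest".
const HEX_QUANTITY = /^0x[0-9a-fA-F]+$/;

// Assumed long TTL for data treated as immutable once it exists.
const IMMUTABLE = "public, max-age=31536000, immutable";

function cacheControlFor(method: string, params: unknown[], result: unknown): string {
  // A missing result (unknown block or transaction) may exist later, so never cache it.
  if (result === null || result === undefined) {
    return "no-store";
  }
  // Methods whose result is constant or changes only rarely.
  if (Object.prototype.hasOwnProperty.call(STATIC_METHOD_TTL_SECONDS, method)) {
    return `public, max-age=${STATIC_METHOD_TTL_SECONDS[method]}`;
  }
  // Blocks are cacheable only when requested by a concrete number.
  if (method === "eth_getBlockByNumber") {
    const block = params[0];
    return typeof block === "string" && HEX_QUANTITY.test(block) ? IMMUTABLE : "no-store";
  }
  // Assumption: a transaction that was found does not change afterwards.
  if (method === "eth_getTransactionByHash") {
    return IMMUTABLE;
  }
  // Everything else is treated as dynamic.
  return "no-store";
}
```

Defaulting to no-store keeps the policy conservative: only methods that are explicitly listed or recognised get a cacheable header.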

Nana-EC commented 1 year ago

Thanks @mshakeg. We'll have to investigate the CDN side. I think these are good solutions as they would indeed reduce throttle hits for static calls.

For many of the methods you pointed out we do have app-level caching that saves the relay work. I guess your request here is also to move this to the CDN level.

mshakeg commented 1 year ago

Hi @Nana-EC thanks for the reply.

I guess your request here is also to move this to the CDN level.

Exactly, though as previously mentioned I'm not sure if CDNs (or the particular CDN that hashio uses) support this particular type of caching, i.e. only considering specific fields in the request body and ignoring others, which in this context entails ignoring the id field, as that's essentially a trace field and has no bearing on the core response.

Btw, I'm not sure if GitHub notifies you for comments that tag you on closed issues, but I tried reaching out to you here. In any case, I don't think either proposed solution would be feasible within the next week. I did elaborate on the 2nd idea here and I'd appreciate any feedback.

Nana-EC commented 1 year ago

Yeah, I didn't see the notification for the closed item. We're looking at your discussion topic. Expect some comments soon if you haven't had some already.

Nana-EC commented 1 year ago

Hey @mshakeg, wanted to circle back on this. We investigated, and the challenge, as you'd also explored, is mostly that JSON RPC requests are hard to cache because most CDNs key their caches on the request path of HTTP GETs. We're exploring open source solutions that could help with this.

Notably, are you still experiencing high throttling? Ethers now has a v6, which we've confirmed support for, so hopefully they've also improved their polling logic.

mshakeg commented 1 year ago

Hey @Nana-EC

The issue isn't limited to the JSON RPC methods being POSTs (this may not even be an issue, as I believe CloudFront can be configured to cache POSTs); the bigger issue is with the id field in the request body, as described in the issue description.

I'm not sure how well the open source solutions y'all are considering would resolve the id issue, but I can think of one solution. It should probably be optionally enabled, as it'll likely result in increased infra costs, though with the benefit of reduced latency. The solution is as follows:

Implement the following 3 Lambda@Edge functions triggered by 3/4 CloudFront events:

  1. Viewer Request:
  2. Origin Response (optional, as the following can be done directly in the relay server):
  3. Viewer Response:

Nana-EC commented 1 year ago

Thanks @mshakeg. Great consideration. This might be something for HashIO or other hosted solutions. However, we try to go for open source solutions so that they're available to all relay operators on any cloud host.