Closed juliangruber closed 1 month ago
For anyone else having this issue, https://github.com/filecoin-station/on-contract-event/tree/main is a temporary workaround
Thanks to @dumikau for finding this code path in lotus-gateway, which is most likely the problem. When connecting to lotus directly, everything works as expected.
/* FILTERS: Those are stateful.. figure out how to properly either bind them to users, or time out? */
func (gw *Node) EthGetFilterChanges(ctx context.Context, id ethtypes.EthFilterID) (*ethtypes.EthFilterResult, error) {
	if err := gw.limit(ctx, stateRateLimitTokens); err != nil {
		return nil, err
	}
	ft := statefulCallFromContext(ctx)
	ft.lk.Lock()
	_, ok := ft.userFilters[id]
	ft.lk.Unlock()
	if !ok {
		return nil, filter.ErrFilterNotFound
	}
	return gw.target.EthGetFilterChanges(ctx, id)
}
Every HTTP request gets its own new statefulCallTracker, which is apparently by design and intended only for websocket connections: https://github.com/filecoin-project/lotus/blob/1b2dde1e65b030975714e06fd792161e7b55a979/gateway/proxy_eth.go#L647-L648
The problem is that filters are long-lived inside a Lotus node and it's perfectly valid to do this via non-websocket requests.
It seems to me that the desire here is to partition the filter and subscription space per-user, but that's not really possible to achieve with the way this all works.
However, filter IDs are generated via UUIDv4, so we have some guarantees about uniqueness and guess-ability already. I'm not sure what other leakage we would try and protect against in a public gateway. So, we could either share a statefulCallTracker across all requests, or just do away with it entirely since it just proxies to the original calls which do essentially the same map look-up operation.
@magik6k am I missing something from 22231dc34f and 1286d76988? Is there a reason I'm missing that we can't just pass these through without checking?
FWIW, it's easy to configure Ethers v6 ethers.JsonRpcProvider to use the old polling-based approach that uses the well-supported RPC method eth_getLogs:
const provider = new ethers.JsonRpcProvider(fetchRequest, undefined, {
  polling: true
})
IMO the action item here is to remove the stateful call tracker from this call path and just pass it through to the node; I don't see a good reason it's gated.
Looking at this again; the tracking was originally introduced in https://github.com/filecoin-project/lotus/pull/9863, and then extended in https://github.com/filecoin-project/lotus/pull/10027 to cover subscribe.
userFilters is only used to track the number of filters applied per connection. EthMaxFiltersPerConn is fixed to 16, and when the number of filters reaches this number for a particular connection then they'll be rejected.
userSubscriptions is only used to track the number of Subscribe calls and also check it against EthMaxFiltersPerConn.
> It seems to me that the desire here is to partition the filter and subscription space per-user
My original comment from above is wrong. The purpose of these checks is to limit the number of filters installed on a lotus node for each "user", which is an appropriate thing for a gateway to do because of the cost of having active filters.
This works fine when using websockets, but we currently don't have any per-IP tracking, and even if we did we'd have to deal with people using reverse proxies in front of lotus-gateway (like glif does). We're then in the realm of deciding whether to accept X-Forwarded-For or not (fine if you have a reverse proxy, dangerous if you don't). We can't give cookies because people are using this from curl or libraries that don't support cookies (making an assumption here about ethers).
It seems like glif doesn't expose websockets, but api.chain.love does, so this ~works (at least it doesn't error; I don't know an address to use to get something more active):
import { ethers } from 'ethers'

const provider = new ethers.WebSocketProvider('wss://api.chain.love/rpc/v1')
console.log('provider:', provider)

const filterId = await provider.send('eth_newFilter', [{
  address: ['0x811765acce724cd5582984cb35f5de02d587ca12'],
  topics: []
}])
console.log('filterId:', filterId)

provider.on('block', async () => {
  const logs = await provider.send('eth_getFilterChanges', [filterId])
  console.log('logs:', logs)
})
I think that we might be forced to just block these stateful API endpoints from HTTP, as suggested in https://github.com/filecoin-project/lotus/issues/11153, unless we want to go down the rabbit hole of per-IP tracking. We could also encourage public API providers to offer a websockets option.
I'd really like to know how this is handled in Ethereum-land. How do public providers offer this normally?
I was thinking that something like the Arbitrum option gets us around the limit problems with this. We get rid of the per-connection limit entirely but set up a liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.
I wouldn't mind offering more options for public API providers, but this is something we could evolve over time. And already now they have the option of excluding these APIs from what they offer with a reverse proxy and they could even do API key gating too.
> liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.
Alas we already have that with FilterTTL in the lotus node itself, which defaults to 24 hours. We probably want to document that this should be reduced dramatically for multi-tenant nodes.
After some discussion on Slack I think that the way forward here is to:
FilterTTL and MaxFilters
This should be resolved in https://github.com/filecoin-project/lotus/pull/12327
Closing as completed, as this should be resolved in https://github.com/filecoin-project/lotus/pull/12327, which shipped in Lotus v1.29.0; most RPC providers have updated to it by now. Please reopen if you still encounter this issue @juliangruber
Checklist
Latest release, the most recent RC (release candidate) for the upcoming release or the dev branch (master), or have an issue updating to any of these.
Lotus component
Lotus Version
Repro Steps
Describe the Bug
After upgrading to ethers@6, it's now failing to subscribe to events. See repro steps above. It responds with "filter not found" although the id returned from eth_newFilter was used.
Logging Information
This was on glif. Same results on chain.love.
I tried reproducing locally, but failed on this:
I did already set EnableEthRPC = true