filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

`eth_getFilterChanges` returns `"filter not found"` #11589

Closed juliangruber closed 1 month ago

juliangruber commented 9 months ago

Lotus Version

lotus deployed to glif

Repro Steps

$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_newFilter","params":[{"topics":["0x2e84339036b9caef6da03565dd37a42d041d8af759ccfddc01625856146ce473"],"addresses":["0x811765acce724cd5582984cb35f5de02d587ca12"]}]}'
{"jsonrpc":"2.0","result":"0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000","id":1}
$ sleep 10 # `sleep 0` and `sleep 5` also don't work
$ curl https://api.node.glif.io/rpc/v0 -d'{"jsonrpc":"2.0","id":1,"method":"eth_getFilterChanges","params":["0x43baae26e5514378adc824ca03b261c100000000000000000000000000000000"]}'
{"jsonrpc":"2.0","id":1,"error":{"code":1,"message":"filter not found"}}

Describe the Bug

After upgrading to ethers@6, it's now failing to subscribe to events. See repro steps above. It responds with "filter not found" although the id returned from eth_newFilter was used.

Logging Information

This was on glif. Same results on chain.love.

I tried reproducing locally, but failed on this:

{"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"method 'eth_newFilter' not found"}}

I did already set EnableEthRPC = true

juliangruber commented 9 months ago

For anyone else having this issue, https://github.com/filecoin-station/on-contract-event/tree/main is a temporary workaround

juliangruber commented 8 months ago

Thanks to @dumikau for finding this code path in lotus-gateway, which is most likely the problem. When connecting to lotus directly, everything works as expected.

/* FILTERS: Those are stateful.. figure out how to properly either bind them to users, or time out? */

func (gw *Node) EthGetFilterChanges(ctx context.Context, id ethtypes.EthFilterID) (*ethtypes.EthFilterResult, error) {
    if err := gw.limit(ctx, stateRateLimitTokens); err != nil {
        return nil, err
    }

    ft := statefulCallFromContext(ctx)
    ft.lk.Lock()
    _, ok := ft.userFilters[id]
    ft.lk.Unlock()

    if !ok {
        return nil, filter.ErrFilterNotFound
    }

    return gw.target.EthGetFilterChanges(ctx, id)
}

rvagg commented 6 months ago

https://github.com/filecoin-project/lotus/blob/1b2dde1e65b030975714e06fd792161e7b55a979/gateway/handler.go#L89-L96

Every HTTP request gets its own new statefulCallTracker, which is apparently by design and intended only for websocket connections: https://github.com/filecoin-project/lotus/blob/1b2dde1e65b030975714e06fd792161e7b55a979/gateway/proxy_eth.go#L647-L648

The problem is that filters are long-lived inside a Lotus node and it's perfectly valid to do this via non-websocket requests.
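
To make the failure mode concrete, here is a toy model of it in plain JavaScript. It is not Lotus code: `FakeNode`, `Tracker`, and the `gw*` functions are stand-ins named loosely after the snippet above. The only thing it tries to show is that a tracker created per request can never contain an id registered by an earlier request, while a tracker that lives as long as a connection can.

```javascript
const ERR_NOT_FOUND = 'filter not found'

// Stand-in for the Lotus node: filters live here across requests.
class FakeNode {
  constructor() { this.filters = new Set(); this.next = 0 }
  newFilter() {
    const id = `filter-${++this.next}`
    this.filters.add(id)
    return id
  }
  getFilterChanges(id) {
    if (!this.filters.has(id)) throw new Error(ERR_NOT_FOUND)
    return []
  }
}

// Stand-in for the gateway's per-connection tracker.
class Tracker {
  constructor() { this.userFilters = new Set() }
}

// Gateway handlers: each call gets whatever tracker its connection owns.
function gwNewFilter(node, tracker) {
  const id = node.newFilter()
  tracker.userFilters.add(id)
  return id
}

function gwGetFilterChanges(node, tracker, id) {
  if (!tracker.userFilters.has(id)) throw new Error(ERR_NOT_FOUND)
  return node.getFilterChanges(id)
}

// Plain HTTP: a new tracker per request, so the second call cannot see the id.
function overHttp(node) {
  const id = gwNewFilter(node, new Tracker())
  try {
    gwGetFilterChanges(node, new Tracker(), id)
    return 'ok'
  } catch (err) {
    return err.message
  }
}

// WebSocket: one tracker for the lifetime of the connection.
function overWebSocket(node) {
  const tracker = new Tracker()
  const id = gwNewFilter(node, tracker)
  gwGetFilterChanges(node, tracker, id)
  return 'ok'
}

console.log(overHttp(new FakeNode()))      // filter not found
console.log(overWebSocket(new FakeNode())) // ok
```

Note that in the HTTP case the filter still exists on the node; only the gateway-side lookup fails.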

It seems to me that the desire here is to partition the filter and subscription space per-user, but that's not really possible to achieve with the way this all works.

However, filter IDs are generated via UUIDv4, so we have some guarantees about uniqueness and guess-ability already. I'm not sure what other leakage we would try and protect against in a public gateway. So, we could either share a statefulCallTracker across all requests, or just do away with it entirely since it just proxies to the original calls which do essentially the same map look-up operation.

@magik6k am I missing something from 22231dc34f and 1286d76988? Is there a reason I'm missing that we can't just pass these through without checking?

bajtos commented 5 months ago

FWIW, it's easy to configure Ethers v6 ethers.JsonRpcProvider to use the old polling-based approach that uses the well-supported RPC method eth_getLogs:


// fetchRequest is assumed to be defined elsewhere: an RPC URL string or an ethers.FetchRequest
const provider = new ethers.JsonRpcProvider(fetchRequest, undefined, {
  polling: true
})
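
As a rough illustration of why polling sidesteps the problem: a poller only needs to remember the last block it has seen and ask `eth_getLogs` for the range since then, so nothing has to persist on the gateway between requests. The sketch below is not how ethers implements it internally; `nextLogQuery` and its arguments are hypothetical.

```javascript
// Convert a block number to the hex quantity format used by JSON-RPC.
const toHex = (n) => '0x' + n.toString(16)

// Return eth_getLogs params covering blocks not yet seen, or null if none.
function nextLogQuery(lastSeenBlock, headBlock, address, topics) {
  if (headBlock <= lastSeenBlock) return null
  return [{
    fromBlock: toHex(lastSeenBlock + 1),
    toBlock: toHex(headBlock),
    address,
    topics
  }]
}
```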

rvagg commented 5 months ago

IMO the action item here is to remove the stateful call tracker from this call path and just pass it through to the node; I don't see a good reason it's gated.

rvagg commented 3 months ago

Looking at this again; the tracking was originally introduced in https://github.com/filecoin-project/lotus/pull/9863, and then extended in https://github.com/filecoin-project/lotus/pull/10027 to cover subscribe.

> It seems to me that the desire here is to partition the filter and subscription space per-user

My original comment from above is wrong. The purpose of these checks is to limit the number of filters installed on a lotus node for each "user", which is an appropriate thing for a gateway to do because of the cost of having active filters.
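
A minimal sketch of that kind of cap, in plain JavaScript rather than the gateway's Go: each connection owns one `ConnFilters` object, and installs are refused once it is full. The class name, the error message, and the limit of 2 are all made up for illustration; the thread does not state the real value.

```javascript
// Hypothetical cap; the real gateway limit is not stated in this thread.
const MAX_FILTERS_PER_CONN = 2

class ConnFilters {
  constructor(max = MAX_FILTERS_PER_CONN) {
    this.max = max
    this.ids = new Set()
  }

  // Record a filter for this connection, refusing once the cap is reached.
  install(id) {
    if (this.ids.size >= this.max) throw new Error('too many filters')
    this.ids.add(id)
  }

  // Forget a filter so its slot can be reused; returns whether it was known.
  uninstall(id) {
    return this.ids.delete(id)
  }
}
```

The cap only means something if the same object is seen by every request from one "user", which is exactly what plain HTTP requests lack.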

This works fine when using websockets, but we currently don't have any per-IP tracking, and even if we did we'd have to deal with people using reverse proxies in front of lotus-gateway (like glif does). We're then in the realm of deciding whether to accept X-Forwarded-For or not (fine if you have a reverse proxy, dangerous if you don't). We can't give cookies because people are using this from curl or libraries that don't support cookies (making an assumption here about ethers).
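
To spell out the X-Forwarded-For trade-off, here is a sketch of how a per-client key could be chosen. It assumes the operator configures a set of trusted proxy addresses (`trustedProxies`, a hypothetical setting) and that header names arrive lower-cased, as they do in Node's `http` module. Nothing like this exists in lotus-gateway as far as this thread says.

```javascript
// Pick the key used to group requests by client.
// trustedProxies: Set of peer addresses allowed to supply X-Forwarded-For.
function clientKey(remoteAddr, headers, trustedProxies) {
  const xff = headers['x-forwarded-for']
  // Without a trusted proxy in front, the header is client-controlled: ignore it.
  if (!xff || !trustedProxies.has(remoteAddr)) return remoteAddr
  // Walk right to left: the rightmost entries were appended by the proxies
  // closest to us, so the first untrusted hop is the best guess at the client.
  const hops = xff.split(',').map((s) => s.trim()).filter(Boolean)
  for (let i = hops.length - 1; i >= 0; i--) {
    if (!trustedProxies.has(hops[i])) return hops[i]
  }
  return remoteAddr
}
```

With an empty `trustedProxies` set this degrades to plain per-peer tracking, which is the safe default when no reverse proxy is deployed.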

It seems like glif doesn't expose websockets, but api.chain.love does, so this ~works (at least it doesn't error; I don't know an address to use to get something more active):

import { ethers } from 'ethers'

const provider = new ethers.WebSocketProvider('wss://api.chain.love/rpc/v1')

console.log('provider:', provider)

const filterId = await provider.send('eth_newFilter', [{
    address: ['0x811765acce724cd5582984cb35f5de02d587ca12'],
    topics: []
}])

console.log('filterId:', filterId)

provider.on('block', async() => {
    const logs = await provider.send('eth_getFilterChanges', [filterId])
    console.log('logs:', logs)
})

I think that we might be forced to just block these stateful API endpoints over HTTP, as suggested in https://github.com/filecoin-project/lotus/issues/11153, unless we want to go down the rabbit hole of per-IP tracking. We could also encourage public API providers to offer a websocket option.

I'd really like to know how this is handled in Ethereum-land. How do public providers offer this normally?

rvagg commented 3 months ago

I was thinking that something like the Arbitrum option gets us around the limit problems with this. We get rid of the per-connection limit entirely but set up a liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled within a certain period of time.
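
Roughly what such a liveness check could look like, sketched in plain JavaScript: every poll refreshes a timestamp, and a periodic sweep uninstalls anything that has gone quiet for longer than the TTL. `FilterLiveness` and the `uninstall` callback are hypothetical; in a real gateway the callback would presumably forward `eth_uninstallFilter` to the node.

```javascript
class FilterLiveness {
  constructor(ttlMs, uninstall) {
    this.ttlMs = ttlMs
    this.uninstall = uninstall // called with the id of each expired filter
    this.lastPolled = new Map()
  }

  // Call on eth_newFilter and on every eth_getFilterChanges for this id.
  touch(id, now = Date.now()) {
    this.lastPolled.set(id, now)
  }

  // Remove every filter not polled within the TTL; returns the removed ids.
  sweep(now = Date.now()) {
    const removed = []
    for (const [id, last] of this.lastPolled) {
      if (now - last > this.ttlMs) {
        this.lastPolled.delete(id)
        this.uninstall(id)
        removed.push(id)
      }
    }
    return removed
  }
}
```

Timestamps are passed in explicitly so the behaviour can be checked without waiting; a deployment would call `sweep()` from a timer.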

I wouldn't mind offering more options for public API providers, but this is something we could evolve over time. And already now they have the option of excluding these APIs from what they offer with a reverse proxy and they could even do API key gating too.

rvagg commented 3 months ago

> liveness check in the gateway that will automatically remove the filter from the lotus node if it's not polled after a certain period of time.

Alas we already have that with FilterTTL in the lotus node itself, which defaults to 24 hours. We probably want to document that this should be reduced dramatically for multi-tenant nodes.

rvagg commented 3 months ago

After some discussion on Slack I think that the way forward here is to:

rvagg commented 3 months ago

This should be resolved in https://github.com/filecoin-project/lotus/pull/12327

rjan90 commented 1 month ago

Closing as completed, as this should be resolved by https://github.com/filecoin-project/lotus/pull/12327, which shipped in Lotus v1.29.0; most RPC providers have updated to it by now. Please reopen if you still encounter this issue @juliangruber