checkpoint-labs / checkpoint

Checkpoint is a library for indexing data from Starknet contracts.
https://checkpoint.box
MIT License

feat: add EVM indexer #274

Closed. Sekhmet closed this 1 month ago.

Sekhmet commented 7 months ago

Summary

Closes: https://github.com/checkpoint-labs/checkpoint/issues/188
Closes: https://github.com/snapshot-labs/pitches/issues/29

To make indexers extractable, the logic was separated into an Indexer class that the consumer instantiates. This allows better separation (no need to pull in dependencies for all networks). Everything is currently exported under either an evm or a starknet object; it can be extracted later (we might need to extract some common things into another package first). We also have network-specific types for writers.
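As a rough illustration of the separation described above, the pattern looks something like the following sketch. All names here are hypothetical and do not reflect the actual Checkpoint API; the point is only that network-specific providers sit behind one generic Indexer, so a consumer only pulls in the dependencies for its own network.

```typescript
// Hypothetical sketch of the indexer separation pattern; names are
// illustrative, not the actual Checkpoint API.
interface BaseProvider {
  networkId: string;
  fetchLogs(from: number, to: number): string[];
}

// One generic Indexer wraps any network-specific provider, so the
// consumer instantiates it with only the provider it needs.
class Indexer {
  constructor(private provider: BaseProvider) {}

  run(from: number, to: number): string[] {
    return this.provider.fetchLogs(from, to);
  }
}

// A consumer using a (stubbed) EVM provider:
const evmProvider: BaseProvider = {
  networkId: 'eth',
  fetchLogs: (from, to) => [`logs ${from}-${to}`]
};
const indexer = new Indexer(evmProvider);
```

A Starknet consumer would pass a Starknet-flavored provider instead, without the EVM dependencies ever being loaded.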

Those things are not handled right now:

Test plan

query {
    proxies (orderBy: created_at_block, orderDirection: desc) {
        id
        implementation
        deployer
        tx_hash
        created_at
        created_at_block
    }
}
bonustrack commented 7 months ago

I'm trying to index another contract on Goerli to test, but the eth_getLogs request times out after a few minutes. Here is the error:

[screenshot of the error]

I've tried to run it a couple of times but still end up with the same error at the same block. The last_fetched_block is stuck at 6109472. Here is the branch: https://github.com/checkpoint-labs/checkpoint-template/tree/fabien/evm-poster Usually requests time out because there are too many events within the requested block range, but I don't see any txs within the requested range; txs start from block 6623932: https://goerli.etherscan.io/txs?a=0x000000000000cd17345801aa8147b8d3950260ff&p=11 .

Also, in the _metadatas table the network identifier is incorrect, showing starknet_5.

Sekhmet commented 7 months ago

@bonustrack it seems to be some issue on Infura's side that it can never resolve the initial request, as we are not making many requests and I even lowered the maximum range we use to 10 blocks. I asked them about it on Discord.

bonustrack commented 7 months ago

I've updated to the latest changes and even though the network id is fixed, it still updates last_fetched_block in chunks of 100. It's also extremely slow, taking about 10 seconds to update 100 blocks; I tried with Alchemy and Infura and got the same performance, and with Ankr it doesn't work (doesn't resolve). I've tried adding "disable_checkpoints": true in the config file, but it seems like it still queries and stores checkpoints, and it fails at some point with this error:

[screenshot of the error]
Sekhmet commented 7 months ago

I've updated to the latest changes and even though the network id is fixed, it still updates last_fetched_block in chunks of 100.

That's expected; I only modified the maximum step internally when handling requests.

I will check out the rest, but the getLogs call is generally slow and takes about 1 second per call. So generally we should make as few requests as possible, but this might increase the likelihood of timeouts (though it seems it can time out regardless of range).

I assume that Ankr either doesn't support getLogs or fails differently from Infura when the response is too big (need to look into it).

I guess the issue will still be that we depend on the slow getLogs call. Maybe we need a better way of handling it: instead of fetching all logs, we fetch only those that match tracked contracts, and if new contracts are added to sources on the fly we go back. That might solve the issue.
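The idea of fetching only logs that match tracked contracts maps directly onto the address field of the eth_getLogs filter object, which accepts an array of addresses so one request can cover every tracked contract in a block range. A minimal sketch (the helper name is hypothetical):

```typescript
// Build JSON-RPC filter params for eth_getLogs restricted to tracked
// contracts. Block numbers are hex-encoded quantities per the JSON-RPC spec.
function buildLogFilter(addresses: string[], fromBlock: number, toBlock: number) {
  return {
    fromBlock: '0x' + fromBlock.toString(16),
    toBlock: '0x' + toBlock.toString(16),
    // eth_getLogs accepts an array of contract addresses to filter on.
    address: addresses
  };
}

const filter = buildLogFilter(
  ['0x4b4f7f64be813ccc66aefc3bfce2baa01188631c'],
  5717246,
  5717346
);
// filter.fromBlock === '0x573cfe'
```

Filtering by address shifts the work of finding matching logs to the node, so the indexer no longer has to download every log in the range.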

bonustrack commented 7 months ago

I guess the issue will still be that we depend on the slow getLogs call. Maybe we need a better way of handling it: instead of fetching all logs, we fetch only those that match tracked contracts, and if new contracts are added to sources on the fly we go back. That might solve the issue.

Yes, I think this is the only way. If we can target a specific contract it will be more efficient, since we wouldn't need to load every block's events. It might just be a bit tricky to implement. If we prefetch events and find a new deployment event, we can't continue the prefetch process, because we know one contract will be missing and the prefetched events will need to be reset. And the logic to discover a new template contract address is part of a writer function, which in theory should run in the right order with the other writer functions. It sounds like the solution would be to prefetch events until a new deployment is found, run the indexer up to this event, then continue the prefetch. But I'm curious to know if you have other ideas.

I also think that we will not find a perfect block range limit for the eth_getLogs method. I've used it in the past and it usually fails when there are too many events, which is something we can't predict: if there are 100 events within 2 blocks it may fail with a 2-block range, while a 10000-block range may work if there are just 10 events within that range. This also depends on node provider limits. I imagine this would require some kind of exponential backoff to deal with the range.
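Since no fixed range works for every event density, the backoff idea above could look roughly like the following sketch (this is not Checkpoint's current logic, just one common approach): halve the range when a request fails and grow it back gradually on success.

```typescript
// Adaptive block-range sizing: back off exponentially on a
// "too many results" failure, grow back toward the maximum on success.
// A sketch of the approach, not Checkpoint's actual implementation.
function nextRange(
  current: number,
  succeeded: boolean,
  max = 10000,
  min = 1
): number {
  if (succeeded) {
    // Grow cautiously back toward the maximum after a success.
    return Math.min(current * 2, max);
  }
  // Halve the range when the provider rejects it.
  return Math.max(Math.floor(current / 2), min);
}

// nextRange(1000, false) -> 500
// nextRange(500, true)   -> 1000
// nextRange(10000, true) -> 10000 (capped at max)
```

This converges on a range the provider can serve without having to predict event density in advance.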

Sekhmet commented 7 months ago

Yes, I think this is the only way. If we can target a specific contract it will be more efficient, since we wouldn't need to load every block's events. It might just be a bit tricky to implement. If we prefetch events and find a new deployment event, we can't continue the prefetch process, because we know one contract will be missing and the prefetched events will need to be reset. And the logic to discover a new template contract address is part of a writer function, which in theory should run in the right order with the other writer functions. It sounds like the solution would be to prefetch events until a new deployment is found, run the indexer up to this event, then continue the prefetch. But I'm curious to know if you have other ideas.

I think this was considered as one way of implementing it for Starknet before, but we decided to go the other way. I can't find the discussion anymore to see what the reasoning was. Ideally we should have consistent logic on all networks, at least to a certain degree.

I also think that we will not find a perfect block range limit for the eth_getLogs method. I've used it in the past and it usually fails when there are too many events, which is something we can't predict: if there are 100 events within 2 blocks it may fail with a 2-block range, while a 10000-block range may work if there are just 10 events within that range. This also depends on node provider limits. I imagine this would require some kind of exponential backoff to deal with the range.

Infura handles that and tells you the new range to try, and we handle it here: https://github.com/checkpoint-labs/checkpoint/blob/81dd6bf5aeb9083f447d69711f1e0be0cba7d1c4/src/providers/evm/provider.ts#L267-L271
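Handling a provider-suggested retry range amounts to parsing it out of the error message. The exact message format below is an assumption based on how Infura reports "too many results" errors; the function name is hypothetical:

```typescript
// Extract a provider-suggested retry range from an error message of the
// assumed form "... Try with this block range [0x..., 0x...]".
// Returns [from, to] as numbers, or null if no range is present.
function parseSuggestedRange(message: string): [number, number] | null {
  const match = message.match(/\[(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\]/);
  if (!match) return null;
  return [parseInt(match[1], 16), parseInt(match[2], 16)];
}

const range = parseSuggestedRange(
  'query returned more than 10000 results. ' +
    'Try with this block range [0x573cfe, 0x573d62]'
);
// range is [5717246, 5717346]
```

As noted below in the thread, relying on this is Infura-specific; other providers would fall through to the null branch and need a generic backoff instead.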

bonustrack commented 7 months ago

@Sekhmet I just found the discussion, it's here: https://discord.com/channels/1088202060283007137/1088202061176381492/1113515047277318214

Haven't read it yet, but I believe we can do such change on both Starknet and EVM chains.

Infura handles that and tells you the new range to try, and we handle it here:

Ok, but if it's specific to Infura it wouldn't be ideal; we can't assume that everyone, including us, will use Infura.

Sekhmet commented 7 months ago

I spent some time debugging timeouts, and for some reason it very rarely happens when using curl (I've only been able to reproduce it while running Checkpoint in the background), but I was able to reproduce it using a simple script with node-fetch: https://gist.github.com/Sekhmet/a8f3bb693f51fb7fdf713ce5e6253ff7

This also happens when using an address filter and a topics filter, so I'm unsure if we can do anything about it. I provided those PoCs to Infura for debugging.

bonustrack commented 7 months ago

If you like, I can send you endpoints from other providers to see whether this issue is just on Infura's side. Even if this resolves quickly, I think syncing all events from a chain is never going to be fast; on a chain like Arbitrum, where the block time is small, it will become even more problematic. We most likely need to make the change we discussed previously and sync only the relevant contracts.

Sekhmet commented 1 month ago

@bonustrack it would be great if we could review this; it would be nice to have the refactor land on master to avoid conflicts.

bonustrack commented 1 month ago

What do you mean by "Indexed event parameters probably don't work"? Is it about events coming from contracts not defined in the config?
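For context on what "indexed event parameters" means here: indexed event arguments arrive in the log's topics array rather than in the data field, each left-padded to 32 bytes, so decoding them is a separate code path from decoding non-indexed data. A minimal sketch for the common case of an indexed address (helper name hypothetical):

```typescript
// An indexed address parameter is ABI-encoded into a 32-byte log topic:
// 12 bytes of zero padding followed by the 20-byte address. Decoding it
// means taking the last 40 hex characters.
function decodeIndexedAddress(topic: string): string {
  // topic: '0x' + 64 hex chars; address: '0x' + last 40 hex chars.
  return '0x' + topic.slice(-40);
}

const topic =
  '0x000000000000000000000000c3031a7d3326e47d49bff9d374d74f364b29ce4d';
// decodeIndexedAddress(topic) === '0xc3031a7d3326e47d49bff9d374d74f364b29ce4d'
```

Dynamic indexed types (strings, bytes, arrays) are stored as a hash of the value rather than the value itself, which is one reason indexed parameters need dedicated handling in an indexer.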

bonustrack commented 1 month ago

It seems to work well. When I compared events I saw I have one extra event, 26 instead of 25; not sure why. This is the event that I get. It's the very first event detected by Checkpoint on this contract, and I'm not sure why it's missing on Etherscan:

{
  "id": "0x3fBc546BC7Fcf1e6eC7dAdfe6eBCf3c3ad2713ed",
  "implementation": "0xC3031A7d3326E47D49BfF9D374d74f364B29CE4D",
  "deployer": "0x556B14CbdA79A36dC33FcD461a04A5BCb5dC2A70",
  "tx_hash": "0xa97e863f7089dc5ee3852edaf911d5eb4098c8908ba3b9756c16164f1b5caf4d",
  "created_at": 1713349560,
  "created_at_block": 5717246
}
Sekhmet commented 1 month ago

@bonustrack it shows the last 25 transactions by default, and this is the 26th. https://sepolia.etherscan.io/txs?a=0x4b4f7f64be813ccc66aefc3bfce2baa01188631c

There are more transactions; we just index them starting from a specific block.

bonustrack commented 1 month ago

Ok, actually I was looking here and saw only 25 events: https://sepolia.etherscan.io/address/0x4b4f7f64be813ccc66aefc3bfce2baa01188631c#events Are you saying there are actually 147 txs and events on this contract, but the sync starts at a specific block, and it's correct that I have 26 proxies on my Checkpoint test instance?

Sekhmet commented 1 month ago

Yes, I didn't want to make it index everything, but we could. I was checking it by looking at all transactions on Etherscan starting at block 5717246 (the one we have configured) and making sure all of them are also present in Checkpoint.