Meta Issue: Fixing high impact correctness and performance problems in ETH RPC API for snapshot synced nodes

aarshkshah1992 commented 1 month ago

This issue aims to be a meta-issue to capture and track work that needs to be done to enhance correctness, performance, and stability of the ETH RPC API on snapshot synced nodes. Note that improving performance for ETH RPC API on archival nodes is out of scope for this issue and will be addressed by a future issue.

Our goal is to improve the developer experience (DX) for key partners, including:

ETH Subgraph Providers (e.g., Vulcanize)
EVM Explorers (e.g., Blockscout)
Cross-chain Bridges (e.g., Axelar)
DApp Developers using existing ETH tooling which relies on ETH RPC API

Correctness and data availability issues in the chain state Indexes used by the ETH RPC API

Currently, we maintain three primary indices on the chain state, which are essential for both correctness and performance of multiple ETH RPC APIs.

Transaction Index

Maps an ETH transaction hash to a Filecoin message CID.
This mapping is one-way; if it's not in the index, there's no way to go from an ETH transaction hash to a Filecoin message CID.
APIs depend on this index to lookup the Filecoin message CID for a given ETH transaction hash, and then lookup the message by CID to get the actual transaction data.
If an entry is not found in this Index for a given transaction hash, the subsequent lookup for the message will be slow and likely fail if the query pertains to an ETH transaction.

Message Index

Maps a Message("transaction") to it's tipset and block number
Heavily used by ETH RPC to lookup the tipset containing a given message CID to get the full message/receipts
A miss on this Index leads to a painfully slow search through the state store for the requested message

Event Index

Associates each tipset/block with the event logs for all transactions contained within.
This index is crucial for the functionality of the ETH Event APIs, which are extensively utilized by subgraph providers and bridges.
Issues with data availability in this index can lead to costly recomputation of tipsets to regenerate events or result in the Events API failing to return events that should be available—assuming the corresponding tipset is present in the chain state, these events should be reliably indexed.

All of the above indices suffer from some or all of the following problems that need to be fixed:

They're not reset and rehydrated when a node syncs from a snapshot
They're not automatically backfilled on node startup
The lotus-shed backfilling CLI that users rely on for manually backfilling the indices is broken as all the Indices are persisted in Sqlite and Sqllite only supports a single writer. This effectively means that backfilling races with indexing new/ongoing state transitions
Indexing is done asynchronously to tipset/message execution but APIs that rely on these indices do not account for the async nature of indexing which leads to racy data avability issues for lookups at/near the chain head
They are not subject to garbage collection when the Splitstore is GC'd, which means they retain data that will never be accessed and the indices keep growing in size. Although the space used by these indices is minimal compared to the overall chainstate, the accumulation of unnecessary data could potentially slow down index queries.
Despite being operational for over a year, features like the Message Index are still labeled as "EXPERIMENTAL" which cause confusion among Node operators.
Better logging and telemetry around index usage and backfilling.
Consistency checks and auto-repair mechanisms in the Indices to 1) Flag missing data in the indices for epochs that have achieved finality and 2) Lazy backfilling of missing data in the indices on user requests.

Correctness problems in the ETH Events API

12111 - Event Filter APIs have raciness that can return incorrect results.
10911 - Mismatch between the block hash returned by ETH Get Block API and the block hash returned by the ETH Events API. This one could have been caused by a re-org but a solid itest to verify that this is no longer a problem would be great.
11589 AND #11153 Event Filter APIs should work with the HTTP Gateway as expected by ETH tooling.
10940 - eth_getLogs should differentiate between "processed the block it has no events" vs "never seen this block" errors. We already have the required scaffolding and metdata for this in place but need to fix the error handling here and write some solid tests

In-memory block caching for perf improvements

Multiple ETH RPC APIs frequently need to lookup Filecoin Tipsets and convert them to the correspondong Ethereum block representations. These lookups are performed on the chainstore which is expensive. We should cache these tipsets/blocks in an LRU cache. See #10520.

Miscellaneous correctness bugs from the backlog

10909 -eth_getBlock does not confirm to ETH RPC spec for Filecoin null rounds (null rounds are a quirk in Filecoin and need to be handled correctly here).
11635 - The ETH Trace API currently fails to include the byte code of the deployed smart contract in the trace output for transactions that deploy smart contracts. IIRC, Blockscout really needed this to be able to show the contract byte code on their explorer.
10357 - Correctness bug in eth_getTransactionCount.

snissn commented 1 month ago

For any and all of these that have ways to reproduce it would be great to coordinate and add the RPC call behind any of these issues to the RPC benchmark tool that Fil-B has been maintaining - https://github.com/fil-builders/benchmark-rpc/blob/main/pages/index.js#L21

For example this issue https://github.com/filecoin-project/lotus/issues/10940 should be easy to reproduce in a live test. I created a ticket for it https://github.com/FIL-Builders/benchmark-rpc/issues/1 to add it to the web app http://benchmark-rpc.fil.builders/

rvagg commented 1 month ago

They're not reset and rehydrated when a node syncs from a snapshot

msgindex is @ https://github.com/filecoin-project/lotus/blob/718fc0330f313a37d5deb73e6e05f0fac2c9b772/cmd/lotus/daemon.go#L647

It at least has a pattern we can follow for others. But it also overlaps with a backfill operation, so we may end up taking care of snapshot import with a general backfill routine if we get that right.

Stebalien commented 1 month ago

This list seems pretty complete. IMO, the highest priority is fixing the indexing issues:

P0: Make sure that we backfill to the last indexed tipset on restart. We should never have "holes".
P1: Make sure that we wait for the index in some cases, or even eagerly force it. E.g., as we discussed on the call, all of the EthGetBlockBy* commands should force the indexer to index that block (and its parents).

Indexing is done asynchronously to tipset/message execution but APIs that rely on these indices do not account for the async nature of indexing which leads to racy data avability issues for lookups at/near the chain head

IMO, the best way to handle this case is the dance we discussed on the call:

Check the index. If present, return.
Wait for the index to reach the current head.
Check again.

This will miss uncles, but StateSearchMsg is designed to only find messages on the main chain.

I took a look at how geth handles stuff like this and... they also appear to index asynchronously and handle this case by returning an error if the node is currently indexing a block. That's not a terrible option... but it would be a larger breaking change.

BigLep commented 1 month ago

This is a great overview @aarshkshah1992 - thanks for writing it up. A few questions, some of which are coming from a newbie/ignorant-of-the-code perspective. I'm happy to chat on any of these elsewhere or offline, but figured to ask here so it's public.

Can any of the the ETH RPC code simplify when we have fast finality (F3)? If so, given we’re so close to that being live on the network (less than 2 months), should we do work here with that assumption in mind?
Assuming we "make sure that we backfill to the last indexed tipset on restart so we never have holes" and we "make sure that we wait for the index" can we simplify the code failing fast if we have an index miss? I'm eyeing your statements in the issue description about “the subsequent lookup for the message will be slow and likely fail if the query pertains to an ETH transaction” and “A miss on this Index leads to a painfully slow search through the state store for the requested message”. It's presumably not good to be in this state, but would it better off to fail fast then do expensive state traversals? (I assume private Lotus RPC methods can be invoked if someone needs the slow path.)
Is there value in having separate db's? (I don't have insight here - just curious.). I see there is an issue about consolidating in https://github.com/filecoin-project/lotus/issues/10807 . (Maybe it's best to discuss this question there.)

aarshkshah1992 commented 1 month ago

@BigLep

I don't think there's any work here that's bound to a certain notion of finality and so not sure if F3 changes anything here in terms of the work we need to do on ETH RPC/Chain state indexing.
I don't think there is any value in having separate DBs but @Stebalien can confirm the team's line of thinking when this was implemented.

Let me look into 2.

Stebalien commented 1 month ago

I don't think there's any work here that's bound to a certain notion of finality and so not sure if F3 changes anything here in terms of the work we need to do on ETH RPC/Chain state indexing.

That depends on what we do with the API. If F3 is "fast enough", we could just not expose anything after finality. But... that's probably not going to work well.

I don't think there is any value in having separate DBs but @Stebalien can confirm the team's line of thinking when this was implemented.

I agree there's no reason to keep them separate.

aarshkshah1992 commented 1 month ago

@Stebalien @BigLep @rvagg

In the first pass of this work, we're not going to work on merging the DBs for these as that is a larger refactor and will need a non-trivial migration for users and we've not estimated it yet.

Let's get to it once we've fixed all the other problems here.

BigLep commented 3 weeks ago

In the first pass of this work, we're not going to work on merging the DBs for these as that is a larger refactor and will need a non-trivial migration for users and we've not estimated it yet.

Let's get to it once we've fixed all the other problems here.

For visibility, it was decided that it would be useful to merge the DBs into a single DB. The work is happening in https://github.com/filecoin-project/lotus/pull/12421

filecoin-project / lotus