Closed peterargue closed 3 weeks ago
Fetching (update)
Errors are fetched and stored from execution nodes as a parallel step in the ingestion engine, which is based on executionReceipts. All errors for the block can be requested at once through the GetTransactionErrorMessagesByBlockID API call.
Problem Description
Currently, Access nodes with execution data indexing enabled cache transaction result error messages from execution nodes (ENs) in memory. If the node restarts, the cache is lost and the node must re-request those messages on subsequent lookups. This is fine for the active network, since there will always be several execution nodes available. On historic networks, however, we'd like to stop running execution nodes entirely to save on costs.
Proposed Solution
Add a CLI flag that, when enabled, makes the node store transaction error messages in its protocol database. During API requests, the node would then check its local db first before requesting from execution nodes.
When this option is enabled, the node should pre-fetch the messages during the indexing process instead of waiting for an API request. This will ensure that messages are available for all transactions, which is required for historic mode.
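The local-first lookup described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the LocalStore and ExecutionNodeClient interfaces, the ErrNotFound sentinel, and the use of plain strings for IDs are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel returned when the local db has no entry.
var ErrNotFound = errors.New("error message not found")

// LocalStore is a hypothetical view of the node's local db.
type LocalStore interface {
	ByBlockIDTransactionID(blockID, txID string) (string, error)
}

// ExecutionNodeClient is a hypothetical stand-in for a request to an execution node.
type ExecutionNodeClient interface {
	GetTransactionErrorMessage(blockID, txID string) (string, error)
}

// LookupErrorMessage checks the local db first and only falls back to an
// execution node when the entry is missing locally.
func LookupErrorMessage(local LocalStore, en ExecutionNodeClient, blockID, txID string) (string, error) {
	msg, err := local.ByBlockIDTransactionID(blockID, txID)
	if err == nil {
		return msg, nil
	}
	if !errors.Is(err, ErrNotFound) {
		// An unexpected db error is surfaced rather than hidden by the fallback.
		return "", fmt.Errorf("local lookup failed: %w", err)
	}
	return en.GetTransactionErrorMessage(blockID, txID)
}

// mapStore is an in-memory LocalStore used only for this example.
type mapStore map[string]string

func (s mapStore) ByBlockIDTransactionID(blockID, txID string) (string, error) {
	msg, ok := s[blockID+"/"+txID]
	if !ok {
		return "", ErrNotFound
	}
	return msg, nil
}

// fakeEN counts how often the execution node is queried.
type fakeEN struct{ calls int }

func (f *fakeEN) GetTransactionErrorMessage(blockID, txID string) (string, error) {
	f.calls++
	return "message from execution node", nil
}

func main() {
	local := mapStore{"block1/tx1": "stored message"}
	en := &fakeEN{}

	msg, _ := LookupErrorMessage(local, en, "block1", "tx1")
	fmt.Println(msg, en.calls)

	msg, _ = LookupErrorMessage(local, en, "block1", "tx2")
	fmt.Println(msg, en.calls)
}
```

With pre-fetching enabled, the fallback branch should rarely be taken, which is what allows a historic network to run without execution nodes.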
Database
The data should be stored as a new data type to the protocol db (which is badger at this point). They will need the following methods (similar to the TransactionResult storage object):
- Store
- ByBlockID
- ByBlockIDTransactionID
- ByBlockIDTransactionIndex

At a minimum, we should store the following for each entry:
- Error -> error message string
- ExecutorID -> node ID of the execution node that the message was received from

Use TransactionResult as a model for how to handle the db operations.

Entries should only be stored for transactions that failed with an error.
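One possible shape for the storage object is sketched below. Only the four method names and the Error and ExecutorID fields come from the requirements above. The Identifier type, the TransactionID and Index fields, the exact method signatures, the ErrNotFound sentinel, and the in-memory memStore are assumptions added so the example compiles and runs on its own. A real implementation would be backed by the protocol db.

```go
package main

import (
	"errors"
	"fmt"
)

// Identifier is a simplified stand-in for a block, transaction, or node ID.
type Identifier string

// ErrNotFound is a hypothetical sentinel for missing entries.
var ErrNotFound = errors.New("not found")

// TransactionResultErrorMessage is one stored entry. TransactionID and Index
// are assumed extra fields that make the per-transaction lookups possible.
type TransactionResultErrorMessage struct {
	TransactionID Identifier
	Index         uint32
	Error         string
	ExecutorID    Identifier
}

// TransactionResultErrorMessages lists the four required methods.
type TransactionResultErrorMessages interface {
	Store(blockID Identifier, msgs []TransactionResultErrorMessage) error
	ByBlockID(blockID Identifier) ([]TransactionResultErrorMessage, error)
	ByBlockIDTransactionID(blockID, txID Identifier) (*TransactionResultErrorMessage, error)
	ByBlockIDTransactionIndex(blockID Identifier, index uint32) (*TransactionResultErrorMessage, error)
}

// memStore is an in-memory implementation used only to exercise the interface.
type memStore struct {
	byBlock map[Identifier][]TransactionResultErrorMessage
}

func newMemStore() *memStore {
	return &memStore{byBlock: make(map[Identifier][]TransactionResultErrorMessage)}
}

func (s *memStore) Store(blockID Identifier, msgs []TransactionResultErrorMessage) error {
	// Copy the slice so later changes by the caller do not affect stored data.
	s.byBlock[blockID] = append([]TransactionResultErrorMessage(nil), msgs...)
	return nil
}

func (s *memStore) ByBlockID(blockID Identifier) ([]TransactionResultErrorMessage, error) {
	msgs, ok := s.byBlock[blockID]
	if !ok {
		return nil, ErrNotFound
	}
	return msgs, nil
}

func (s *memStore) ByBlockIDTransactionID(blockID, txID Identifier) (*TransactionResultErrorMessage, error) {
	for _, m := range s.byBlock[blockID] {
		if m.TransactionID == txID {
			return &m, nil
		}
	}
	return nil, ErrNotFound
}

func (s *memStore) ByBlockIDTransactionIndex(blockID Identifier, index uint32) (*TransactionResultErrorMessage, error) {
	for _, m := range s.byBlock[blockID] {
		if m.Index == index {
			return &m, nil
		}
	}
	return nil, ErrNotFound
}

// Compile-time check that memStore satisfies the interface.
var _ TransactionResultErrorMessages = (*memStore)(nil)

func main() {
	store := newMemStore()
	// Only the failed transaction (index 2) gets an entry.
	_ = store.Store("block1", []TransactionResultErrorMessage{
		{TransactionID: "tx2", Index: 2, Error: "insufficient balance", ExecutorID: "en1"},
	})

	m, err := store.ByBlockIDTransactionIndex("block1", 2)
	fmt.Println(m.Error, m.ExecutorID, err)

	_, err = store.ByBlockIDTransactionID("block1", "tx1")
	fmt.Println(errors.Is(err, ErrNotFound))
}
```

Because entries exist only for failed transactions, a caller of this sketch cannot tell from a not-found result alone whether the transaction succeeded or the block was never indexed. A real implementation would need some way to distinguish the two.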
Indexing
Errors are fetched from execution nodes as a parallel indexing step. All errors for the block can be requested at once.
When querying execution nodes, respect the "fixed" and "preferred" node lists. If querying a node fails, continue to the next node. Stop after the first successful response.
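The try-each-node behavior can be sketched as follows. The example assumes the caller has already built the candidate list in an order that respects the "fixed" and "preferred" lists; how that ordering is produced is not shown. FetchFunc, FetchFromFirstAvailable, and demoFetch are hypothetical names, and node IDs are plain strings for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// FetchFunc is a hypothetical callback that requests all error messages for a
// block from a single execution node.
type FetchFunc func(nodeID string) ([]string, error)

// FetchFromFirstAvailable tries each candidate node in order and returns the
// first successful response along with the ID of the node that produced it.
func FetchFromFirstAvailable(nodeIDs []string, fetch FetchFunc) ([]string, string, error) {
	lastErr := errors.New("no candidate execution nodes")
	for _, id := range nodeIDs {
		msgs, err := fetch(id)
		if err != nil {
			// Remember the failure and continue to the next node.
			lastErr = fmt.Errorf("node %s: %w", id, err)
			continue
		}
		// Stop after the first successful response.
		return msgs, id, nil
	}
	return nil, "", fmt.Errorf("no execution node returned a valid response: %w", lastErr)
}

// demoFetch simulates a set of nodes where only "en2" responds.
func demoFetch(nodeID string) ([]string, error) {
	if nodeID == "en2" {
		return []string{"cadence runtime error"}, nil
	}
	return nil, errors.New("unavailable")
}

func main() {
	msgs, executor, err := FetchFromFirstAvailable([]string{"en1", "en2", "en3"}, demoFetch)
	fmt.Println(msgs, executor, err)

	_, _, err = FetchFromFirstAvailable([]string{"en1", "en3"}, demoFetch)
	fmt.Println(err)
}
```

Returning the responding node's ID alongside the messages lets the caller record it as the ExecutorID of each stored entry.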
If no execution nodes return a valid response, store a static message "failed", with flow.ZeroID as the executor. We can add functionality to backfill these later, but a failed fetch shouldn't block indexing or cause the node to crash. These cases should be handled by the API by requerying ENs when encountered.
Definition of Done