near / read-rpc

Read-only NEAR RPC centralized-like performant solution

feat: Store `TransactionDetails` in object storage (GCS) #247

Closed khorolets closed 3 months ago

khorolets commented 5 months ago

Rationale

We noticed that the transaction_details table in ScyllaDB holds ~2 TB of data, which directly affects the infra cost. We want to reduce that cost further. Building on our successful experience of serving some data directly from the NEAR Lake AWS S3 storage, we propose to move the TransactionDetails data to object storage (GCS) to further offload ScyllaDB.

Since this structure is stored as raw bytes (borsh-serialized), moving it should not make a big difference to performance. Some increase in latency is expected, but it is not significant enough to justify overpaying for database capacity.

What is done

  1. I've introduced a small crate (lib) tx-details-storage that interacts with the provided S3-compatible object storage to store and retrieve raw bytes
  2. Extended the configuration crate with the dedicated config section for tx-details-storage. Updated config.example.toml to reflect new accepted parameters
  3. Refactored tx-indexer to store the finished TransactionDetails to the object storage using the tx-details-storage library. To keep the library as simple as possible, it doesn't handle serialization/deserialization; this is left to the indexer and rpc-server.
  4. Refactored rpc-server to retrieve TransactionDetails from the object storage using the newly introduced library.
  5. Additionally, I've extended the tx-indexer with a few metrics to monitor what's happening there.
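
To illustrate the split of responsibilities described above (the storage layer only moves raw bytes, while callers own serialization), here is a minimal sketch. It is not the actual tx-details-storage API: the `BytesStorage` trait, the `InMemoryBucket` stand-in for an S3-compatible bucket, the simplified `TxDetails` struct, and its hand-rolled length-prefixed encoding (used in place of borsh to keep the example dependency-free) are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical byte-oriented storage interface (an assumption for this
/// sketch, not the real tx-details-storage API). It knows nothing about
/// the structure of the data it stores.
pub trait BytesStorage {
    fn store(&mut self, key: &str, data: Vec<u8>) -> Result<(), String>;
    fn retrieve(&self, key: &str) -> Result<Vec<u8>, String>;
}

/// In-memory stand-in for an S3-compatible bucket.
#[derive(Default)]
pub struct InMemoryBucket {
    objects: HashMap<String, Vec<u8>>,
}

impl BytesStorage for InMemoryBucket {
    fn store(&mut self, key: &str, data: Vec<u8>) -> Result<(), String> {
        self.objects.insert(key.to_string(), data);
        Ok(())
    }

    fn retrieve(&self, key: &str) -> Result<Vec<u8>, String> {
        self.objects
            .get(key)
            .cloned()
            .ok_or_else(|| format!("object not found: {key}"))
    }
}

/// Simplified, hypothetical stand-in for TransactionDetails.
#[derive(Debug, PartialEq)]
pub struct TxDetails {
    pub hash: String,
    pub payload: Vec<u8>,
}

impl TxDetails {
    /// Caller-side serialization: u32 little-endian hash length, the hash
    /// bytes, then the payload. A stand-in for borsh in this sketch.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(4 + self.hash.len() + self.payload.len());
        out.extend_from_slice(&(self.hash.len() as u32).to_le_bytes());
        out.extend_from_slice(self.hash.as_bytes());
        out.extend_from_slice(&self.payload);
        out
    }

    /// Caller-side deserialization, the inverse of `to_bytes`.
    pub fn from_bytes(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() < 4 {
            return Err("input shorter than length prefix".to_string());
        }
        let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
        let rest = &bytes[4..];
        if rest.len() < len {
            return Err("input shorter than declared hash length".to_string());
        }
        let hash = String::from_utf8(rest[..len].to_vec()).map_err(|e| e.to_string())?;
        Ok(Self { hash, payload: rest[len..].to_vec() })
    }
}
```

In this arrangement the indexer would call `to_bytes` before `store`, and the rpc-server would call `from_bytes` after `retrieve`, so the storage layer can be swapped without touching the data format.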

Important note: Recently, we've adjusted the rpc-server to return half-baked (not finished, in progress) transaction details from the database cache table. This logic is preserved.

Next steps

We must wait before delivering this change, because the existing data has yet to be copied into object storage.

I plan to do it in three phases, followed by a cleanup phase:

  1. ✅ Create a small script that will walk over each transaction present in NEAR Protocol and copy the TransactionDetails from ScyllaDB to GCS. This has to be done for both testnet and mainnet
  2. Start the new tx-indexer (from this PR) to continue collecting the data into the object storage
  3. Replace the rpc-server instances with the new ones that can read from the object storage
  4. (cleanup phase) Stop old tx-indexers, drop transaction_details table from ScyllaDB, downscale ScyllaDB

Update from 2024-07-03

The transaction_details table is growing and reaching the limits of the database we have right now. We don't want to scale it, so we need to stop the growth in the short term. I refactored the code a bit, keeping the legacy way of searching for the transaction in the database (Scylla) while we migrate.

I've added an additional metric, legacy_database_tx_details, to monitor the migration period. We expect this counter to stay at zero for some time before we can consider the migration finished.
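
The lookup order during the migration could be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the `TxLookup` type and its methods are hypothetical, both backends are modelled as in-memory maps, the cache-table lookup for in-progress transactions is omitted, and the sketch assumes the legacy_database_tx_details metric counts requests that had to be served from the legacy database.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical lookup helper: object storage first, legacy database second.
pub struct TxLookup {
    /// Stand-in for the object storage (GCS) backend.
    object_storage: HashMap<String, Vec<u8>>,
    /// Stand-in for the legacy transaction_details table in ScyllaDB.
    legacy_database: HashMap<String, Vec<u8>>,
    /// Assumed meaning: number of lookups served by the legacy database.
    legacy_database_tx_details: AtomicU64,
}

impl TxLookup {
    pub fn new(
        object_storage: HashMap<String, Vec<u8>>,
        legacy_database: HashMap<String, Vec<u8>>,
    ) -> Self {
        Self {
            object_storage,
            legacy_database,
            legacy_database_tx_details: AtomicU64::new(0),
        }
    }

    /// Returns the serialized transaction details, preferring object storage
    /// and falling back to the legacy database only on a miss.
    pub fn get(&self, tx_hash: &str) -> Option<Vec<u8>> {
        if let Some(bytes) = self.object_storage.get(tx_hash) {
            return Some(bytes.clone());
        }
        let bytes = self.legacy_database.get(tx_hash)?;
        // Count only successful fallbacks, so the counter reflects how much
        // traffic still depends on the legacy table.
        self.legacy_database_tx_details.fetch_add(1, Ordering::Relaxed);
        Some(bytes.clone())
    }

    /// Current value of the fallback counter.
    pub fn legacy_hits(&self) -> u64 {
        self.legacy_database_tx_details.load(Ordering::Relaxed)
    }
}
```

With a counter like this, a sustained period without increments would suggest that no requests depend on the legacy table any more, which is the signal needed before dropping it.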