filecoin-project / lotus

Reference implementation of the Filecoin protocol, written in Go
https://lotus.filecoin.io/

Scalable Lotus Node Infrastructure with YugabyteDB #10630

Open snissn opened 1 year ago

snissn commented 1 year ago

Title: Scalable Lotus Node Infrastructure with YugabyteDB

Summary

This proposal aims to build scalable Lotus infrastructure that uses YugabyteDB as the backend datastore and supports multiple RPC nodes behind a load balancer. Read IOPS should scale linearly as nodes are added to the Yugabyte cluster (horizontal scaling). Because all RPC nodes share the same Yugabyte cluster, they see identical, synchronized chain state.

Proposal

  1. Integrate YugabyteDB as the backend datastore for Lotus nodes.

  2. Implement a new node mode called "follower," which will support multiple RPC nodes behind a load balancer.

    • Link to GitHub Pull Request
    • [x] - Add a new node "mode" called "follower," where the node doesn't sync but follows the heads of other nodes.
    • [x] - Modify node/builder.go#L241-L246 to add isFollowerNode.
    • [x] - Don't create syncer / syncer services:
    • [x] - Modify node/builder_chain.go#L91-L92 and node/builder_chain.go#L174 to wrap the initialization in an if statement that does not initialize if the node is a follower node.
    • [x] - Don't subscribe to blocks/messages.
    • [x] - Connect to a shared chain blockstore with #10624 (YugabyteDB).
    • [x] - Write a small service replacing the syncer, get heads with ChainNotify from a "full" node RPC, and pass them to chainstore.PutTipSet:
    • [x] - Modify chain/store/store.go#L380.
    • [x] - Write indices to a remote server.
    • [x] - Read indices without writing them.
    • [x] - Caches.
  3. Benchmark the new infrastructure to ensure performance improvements.

  4. Create documentation for setting up the proposed infrastructure.

Infrastructure Diagram


                           ┌───────────────┐
                           │ Load Balancer │
                           └───────┬───────┘
                                   │
           ┌───────────────────────┼───────────────────────────┐
┌──────────┴──────────┐ ┌──────────┴──────────┐     ┌──────────┴──────────┐
│RPC Node (Follower) 1│ │RPC Node (Follower) 2│ ... │RPC Node (Follower) N│
└──────────┬──────────┘ └──────────┬──────────┘     └──────────┬──────────┘
           └───────────────────────┼───────────────────────────┘
                          ┌────────┴────────┐
                          │   YugabyteDB    │
                          └────────┬────────┘
                                   │
                          ┌────────┴────────┐
                          │ Lotus (Leader)  │
                          └─────────────────┘
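
Since every follower reads the same chain state from the shared YugabyteDB cluster, the load balancer at the top of the diagram can in principle hand any request to any follower. A minimal round-robin picker illustrating that idea is sketched below; the RoundRobin type and the follower URLs are hypothetical, and a real deployment would more likely use an off-the-shelf load balancer.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// RoundRobin hands out backend addresses in rotation. It is safe for
// concurrent use by multiple request handlers.
type RoundRobin struct {
	mu       sync.Mutex
	backends []string
	next     int
}

// NewRoundRobin copies the backend list so later changes by the caller
// do not affect the picker.
func NewRoundRobin(backends []string) (*RoundRobin, error) {
	if len(backends) == 0 {
		return nil, errors.New("at least one backend is required")
	}
	return &RoundRobin{backends: append([]string(nil), backends...)}, nil
}

// Pick returns the next backend in rotation, wrapping around at the end.
func (r *RoundRobin) Pick() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	b := r.backends[r.next]
	r.next = (r.next + 1) % len(r.backends)
	return b
}

func main() {
	// Hypothetical follower RPC endpoints, for illustration only.
	rr, err := NewRoundRobin([]string{
		"http://follower-1:1234/rpc/v1",
		"http://follower-2:1234/rpc/v1",
		"http://follower-3:1234/rpc/v1",
	})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	for i := 0; i < 4; i++ {
		fmt.Println(rr.Pick())
	}
}
```

Plain rotation is enough here only because the followers are meant to be interchangeable; health checks and connection draining are left out of the sketch.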


jennijuju commented 1 year ago

To close this issue we want to

f8-ptrk commented 1 year ago

is this anywhere close to being testable? we'd be happy to run on calib asap

snissn commented 1 year ago

is this anywhere close to being testable? we'd be happy to run on calib asap

Yes! It should be fully capable of running on calibnet at this point! It hasn't landed in master yet, but the code is on the branch mikers/feat/cassandra-store for lotus, and the documentation is available here: https://cosmic-halva-918d7c.netlify.app/lotus/configure/followers/ (to be merged into the production docs once the code is merged).