Syndica / sig

a Solana validator client implementation written in Zig
https://syndica.io/sig
Apache License 2.0

Snapshot Propagation #327

Open InKryption opened 3 weeks ago

InKryption commented 3 weeks ago

This issue can be closed once we are able to successfully and efficiently respond to requests for our snapshots through an RPC socket.

Right now, this is the basic outline for the steps to take:

InKryption commented 3 weeks ago

We are not going to follow agave's implementation details exactly, but as a point of reference, the following is an overview of how it currently accomplishes this.

Agave References

Available Snapshot Info

The set of snapshots a node can/will provide is discovered via gossip.

Snapshot Packager Service

Main entry point

In SnapshotPackagerService::new, inside the thread it spawns, the SnapshotGossipManager is optionally created. If it is created, then during the loop it will have the latest snapshot package info pushed into it using push_snapshot_hash.

Snapshot Gossip Manager

Definition

SnapshotGossipManager holds an Arc to the ClusterInfo, and a single Optional instance of LatestSnapshotHashes (a full snapshot hash and an optional incremental snapshot hash). The push_snapshot_hash method replaces the currently tracked latest snapshot infos (full or incremental), and then calls push_latest_snapshot_hashes_to_cluster. The push_latest_snapshot_hashes_to_cluster method pushes the snapshot hashes into the cluster_info using its push_snapshot_hashes method.
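To make the shape of this concrete, here is a minimal Rust sketch of the manager as described above. It is not agave's actual code: `ClusterInfo` is reduced to a stand-in that records what was pushed, hashes are plain strings, and the `SnapshotUpdate` enum is a hypothetical simplification of how the snapshot kind is passed in. Clearing the incremental hash on a new full snapshot, and ignoring an incremental update when no full snapshot is tracked, are assumptions of this sketch.

```rust
use std::sync::{Arc, Mutex};

/// (slot, hash) pair; the hash is a plain string here for brevity.
type SnapshotHash = (u64, String);

/// A full snapshot hash and an optional incremental snapshot hash.
#[derive(Clone, Debug, PartialEq)]
struct LatestSnapshotHashes {
    full: SnapshotHash,
    incremental: Option<SnapshotHash>,
}

/// Hypothetical stand-in for ClusterInfo: it only records what was pushed.
#[derive(Default)]
struct ClusterInfo {
    pushed: Mutex<Vec<LatestSnapshotHashes>>,
}

impl ClusterInfo {
    fn push_snapshot_hashes(&self, hashes: LatestSnapshotHashes) {
        self.pushed.lock().unwrap().push(hashes);
    }
}

/// Hypothetical way of telling the manager which kind of snapshot is new.
enum SnapshotUpdate {
    Full(SnapshotHash),
    Incremental(SnapshotHash),
}

struct SnapshotGossipManager {
    cluster_info: Arc<ClusterInfo>,
    latest: Option<LatestSnapshotHashes>,
}

impl SnapshotGossipManager {
    fn new(cluster_info: Arc<ClusterInfo>) -> Self {
        Self { cluster_info, latest: None }
    }

    /// Replace the tracked full or incremental hash, then push to the cluster.
    fn push_snapshot_hash(&mut self, update: SnapshotUpdate) {
        match update {
            // Assumption: a new full snapshot invalidates the old incremental one.
            SnapshotUpdate::Full(full) => {
                self.latest = Some(LatestSnapshotHashes { full, incremental: None });
            }
            // Assumption: an incremental update without a full snapshot is ignored.
            SnapshotUpdate::Incremental(incremental) => match self.latest.as_mut() {
                Some(latest) => latest.incremental = Some(incremental),
                None => return,
            },
        }
        self.push_latest_snapshot_hashes_to_cluster();
    }

    fn push_latest_snapshot_hashes_to_cluster(&self) {
        if let Some(latest) = &self.latest {
            self.cluster_info.push_snapshot_hashes(latest.clone());
        }
    }
}
```

In this sketch every accepted update results in one push of the complete (full, incremental) pair, so peers always see a consistent view of what the node offers.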

Cluster Info

push_snapshot_hashes definition:

This method simply pushes the snapshot info as a SnapshotHashes message to the local pending messages queue (using push_message, just above it). Later on this will be flushed to the gossip.crds field (see flush_push_queue in the same file). Beyond this point, it's simply a matter of the gossip protocol sharing this info with peers, and through this, making the available snapshots known.
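The queue-then-flush pattern described here can be sketched roughly as follows. This is an illustration only: the `GossipData` enum, the `pending_push_queue` field name, and the string-keyed map standing in for the gossip table are all hypothetical simplifications, not agave's real types.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, heavily reduced gossip payload: only the snapshot message.
#[derive(Clone, Debug, PartialEq)]
enum GossipData {
    SnapshotHashes { full_slot: u64, incremental_slot: Option<u64> },
}

#[derive(Default)]
struct ClusterInfo {
    /// Messages created locally that have not yet been handed to gossip.
    pending_push_queue: Mutex<Vec<GossipData>>,
    /// Stand-in for the gossip table: one entry per message kind,
    /// where a newer value overwrites the older one.
    crds: Mutex<HashMap<&'static str, GossipData>>,
}

impl ClusterInfo {
    /// Append a message to the local pending queue.
    fn push_message(&self, message: GossipData) {
        self.pending_push_queue.lock().unwrap().push(message);
    }

    /// Wrap the snapshot info in a SnapshotHashes message and queue it.
    fn push_snapshot_hashes(&self, full_slot: u64, incremental_slot: Option<u64>) {
        self.push_message(GossipData::SnapshotHashes { full_slot, incremental_slot });
    }

    /// Drain the pending queue into the gossip table.
    fn flush_push_queue(&self) {
        // Take the whole queue in one step so the lock is held only briefly.
        let pending = std::mem::take(&mut *self.pending_push_queue.lock().unwrap());
        let mut crds = self.crds.lock().unwrap();
        for message in pending {
            let label = match &message {
                GossipData::SnapshotHashes { .. } => "SnapshotHashes",
            };
            crds.insert(label, message);
        }
    }
}
```

Separating `push_message` from `flush_push_queue` means that callers such as the snapshot gossip manager never touch the gossip table directly; they only enqueue, and the gossip side decides when to flush.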

Transmitting Snapshots

After a new node identifies the most desirable snapshots via gossip, it can request to download them via the advertising node's RPC interface.

Json RPC Service

Main Entry Point

The JsonRpcService represents the RPC service thread, encapsulating a myriad of functionality related & adjacent to our focus.

After initializing all of its state in its new constructor, the thread configures and initializes the RPC server.

The first phase of configuration involves extending the MetaIoHandler in order to handle all the actual remote procedure calls. The second occurs during the building of the server, when the RPC request middleware is added; this is the part we're interested in.

RPC Request Middleware

Definition

The main entry point is the on_request method, whose logic can be described in the following way:

  1. If the snapshot config field is non-null, and the request's URI path matches the FULL_SNAPSHOT_REQUEST_PATH or the INCREMENTAL_SNAPSHOT_REQUEST_PATH, redirect the request to the latest full or incremental snapshot, if available.

  2. If the request's URI path matches any of the routes for the bank's REST API, respond as is appropriate to the request.

  3. If the request's URI path matches the format of possibly available files as defined by is_file_get_path, call process_file_get.

  4. If the request's URI path is "/health", return the health check.

  5. Otherwise, simply return the request unmodified, to be handled by the regular RPC server procedures (down into the MetaIoHandler, I assume).

Step 1 ultimately leads into step 3, which is implemented by process_file_get: the specified file is queued up to be served. That file is either the genesis file or the specified snapshot; if the specified snapshot doesn't exist, the request simply resolves to a "not found" error response.
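As a rough illustration of the routing order listed above, the following Rust sketch models on_request as a pure function from a URI path to an action. Everything in it is an assumption made for illustration: the `Action` enum, the `Middleware` fields, the values of the two request-path constants, the `/v0/` prefix for the bank REST routes, the genesis path, and the prefix-based `is_file_get_path` check are all placeholders rather than agave's real definitions.

```rust
/// Hypothetical outcome of routing a request.
#[derive(Debug, PartialEq)]
enum Action {
    Redirect(String),
    BankRest,
    ServeFile(String),
    NotFound,
    Health,
    PassThrough,
}

// Placeholder values; the real constants may differ.
const FULL_SNAPSHOT_REQUEST_PATH: &str = "/snapshot.tar.bz2";
const INCREMENTAL_SNAPSHOT_REQUEST_PATH: &str = "/incremental-snapshot.tar.bz2";

struct Middleware {
    /// Stand-in for "the snapshot config field is non-null".
    snapshots_enabled: bool,
    /// File names of the latest archives, if any.
    latest_full: Option<String>,
    latest_incremental: Option<String>,
    /// File names that can actually be served.
    available_files: Vec<String>,
}

impl Middleware {
    /// Hypothetical check: the genesis archive or something named like a snapshot.
    fn is_file_get_path(&self, path: &str) -> bool {
        path == "/genesis.tar.bz2"
            || (self.snapshots_enabled
                && (path.starts_with("/snapshot-")
                    || path.starts_with("/incremental-snapshot-")))
    }

    /// Serve the file if it exists, otherwise answer "not found".
    fn process_file_get(&self, path: &str) -> Action {
        let name = path.trim_start_matches('/');
        if self.available_files.iter().any(|f| f == name) {
            Action::ServeFile(name.to_string())
        } else {
            Action::NotFound
        }
    }

    fn on_request(&self, path: &str) -> Action {
        // Step 1: redirect the generic snapshot paths to the latest archive.
        if self.snapshots_enabled {
            let latest = if path == FULL_SNAPSHOT_REQUEST_PATH {
                self.latest_full.as_ref()
            } else if path == INCREMENTAL_SNAPSHOT_REQUEST_PATH {
                self.latest_incremental.as_ref()
            } else {
                None
            };
            if let Some(name) = latest {
                return Action::Redirect(format!("/{}", name));
            }
        }
        // Step 2: bank REST API routes (hypothetical prefix).
        if path.starts_with("/v0/") {
            return Action::BankRest;
        }
        // Step 3: file downloads (genesis or a specific snapshot).
        if self.is_file_get_path(path) {
            return self.process_file_get(path);
        }
        // Step 4: health check.
        if path == "/health" {
            return Action::Health;
        }
        // Step 5: leave the request for the regular RPC handlers.
        Action::PassThrough
    }
}
```

Under these assumptions, a client that asks for the generic full-snapshot path is redirected to the concrete archive name, and its follow-up request for that name is what reaches process_file_get in step 3.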