casper-network / casper-sidecar

Apache License 2.0
Casper Event Sidecar README

Summary of Purpose

The Casper Event Sidecar is an application that runs in tandem with the node process. This reduces the load on the node process by allowing subscribers to monitor the event stream through the Sidecar while the node focuses entirely on the blockchain. Users needing access to the JSON-RPC will still need to query the node directly.

While the primary use case for the Sidecar application is running alongside the node on the same machine, it can be run remotely if necessary.

System Components & Architecture

Sidecar Diagram

Casper Nodes offer a Node Event Stream API returning Server-Sent Events (SSEs) that hold JSON-encoded data. The SSE Sidecar uses this API to achieve the following goals:

The SSE Sidecar uses one ring buffer for outbound events, providing some robustness against unintended subscriber disconnects. If a disconnected subscriber re-subscribes before the buffer moves past their last received event, there will be no gap in the event history if they use the start_from URL query.
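The replay behaviour described above can be sketched as a small model. This is an illustration only, not the Sidecar's implementation: the class name, the method names, and the assumption that start_from is inclusive are all hypothetical.

```python
from collections import deque


class EventRingBuffer:
    """Hypothetical model: fixed-size buffer of (event_id, payload) pairs."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest entry when full.
        self._events = deque(maxlen=capacity)
        self._next_id = 0

    def push(self, payload):
        """Store a new event and return the ID assigned to it."""
        event_id = self._next_id
        self._events.append((event_id, payload))
        self._next_id += 1
        return event_id

    def replay_from(self, start_from):
        """Return (events, gap): buffered events with ID >= start_from,
        and whether some requested events were already evicted."""
        if not self._events:
            return [], start_from < self._next_id
        oldest_id = self._events[0][0]
        gap = start_from < oldest_id
        events = [(i, p) for i, p in self._events if i >= start_from]
        return events, gap
```

A subscriber that reconnects while its last seen event is still buffered gets gap == False; once the buffer has moved past that event, gap == True and part of the history is missing.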

Prerequisites

Configuration

The SSE Sidecar service must be configured using a .toml file specified at runtime.

This repository contains several sample configuration files that can be used as examples and adjusted according to your scenario:

Once you have created the configuration file and are ready to run the Sidecar service, pass the file's path with the --path-to-config option (placed after the -- separator when launching through Cargo), as described in the Running the Sidecar section.

Node Connections

The Sidecar can connect to Casper nodes running version 1.5.2 or later.

The node_connections option configures the node (or multiple nodes) to which the Sidecar will connect and the parameters under which it will operate with that node. Connecting to multiple nodes requires multiple [[connections]] sections.

[[connections]]
ip_address = "127.0.0.1"
sse_port = 18101
rest_port = 14101
max_attempts = 10
delay_between_retries_in_seconds = 5
allow_partial_connection = false
enable_logging = true
connection_timeout_in_seconds = 3
no_message_timeout_in_seconds = 60
sleep_between_keep_alive_checks_in_seconds = 30

[[connections]]
ip_address = "127.0.0.1"
sse_port = 18102
rest_port = 14102
max_attempts = 10
delay_between_retries_in_seconds = 5
allow_partial_connection = false
enable_logging = false
connection_timeout_in_seconds = 3
no_message_timeout_in_seconds = 60
sleep_between_keep_alive_checks_in_seconds = 30

[[connections]]
ip_address = "127.0.0.1"
sse_port = 18103
rest_port = 14103
max_attempts = 10
delay_between_retries_in_seconds = 5
allow_partial_connection = false
enable_logging = false
connection_timeout_in_seconds = 3
no_message_timeout_in_seconds = 60
sleep_between_keep_alive_checks_in_seconds = 30
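Because every [[connections]] section repeats the same keys, it can be convenient to generate them. The helper below is a minimal sketch, assuming the key names and default values shown in the sample above; render_connections is a hypothetical name and is not part of the Sidecar.

```python
def _toml_value(value):
    # bool must be checked first because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def render_connections(nodes, **overrides):
    """Render one [[connections]] section per (ip_address, sse_port, rest_port)."""
    defaults = {
        "max_attempts": 10,
        "delay_between_retries_in_seconds": 5,
        "allow_partial_connection": False,
        "enable_logging": False,
        "connection_timeout_in_seconds": 3,
        "no_message_timeout_in_seconds": 60,
        "sleep_between_keep_alive_checks_in_seconds": 30,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError("unknown connection options: %s" % sorted(unknown))
    settings = {**defaults, **overrides}

    sections = []
    for ip_address, sse_port, rest_port in nodes:
        lines = [
            "[[connections]]",
            'ip_address = "%s"' % ip_address,
            "sse_port = %d" % sse_port,
            "rest_port = %d" % rest_port,
        ]
        lines += ["%s = %s" % (k, _toml_value(v)) for k, v in settings.items()]
        sections.append("\n".join(lines))
    return "\n\n".join(sections) + "\n"
```

For example, render_connections([("127.0.0.1", 18101, 14101), ("127.0.0.1", 18102, 14102)]) produces two sections that can be pasted into the configuration file.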

Storage

The storage_path option sets the directory that holds the SSE cache and, if the Sidecar is configured to use SQLite, the SQLite database file.

[storage]
storage_path = "./target/storage"

Database Connectivity

The Sidecar can connect to different types of databases. The current options are SQLite or PostgreSQL. The following sections show how to configure the database connection for one of these DBs. Note that the Sidecar can only connect to one DB at a time.

SQLite Database

This section includes configurations for the SQLite database.

[storage.sqlite_config]
file_name = "sqlite_database.db3"
max_connections_in_pool = 100
# https://www.sqlite.org/compile.html#default_wal_autocheckpoint
wal_autocheckpointing_interval = 1000

PostgreSQL Database

The properties listed below are elements of the PostgreSQL database connection that can be configured for the Sidecar.

To run the Sidecar with PostgreSQL, you can set the following database environment variables to control how the Sidecar connects to the database. This is the suggested method to set the connection information for the PostgreSQL database.

SIDECAR_POSTGRES_USERNAME="your username"
SIDECAR_POSTGRES_PASSWORD="your password"
SIDECAR_POSTGRES_DATABASE_NAME="your database name"
SIDECAR_POSTGRES_HOST="your host"
SIDECAR_POSTGRES_MAX_CONNECTIONS="max connections"
SIDECAR_POSTGRES_PORT="port"

However, DB connectivity can also be configured using the Sidecar configuration file.

If the DB environment variables and the Sidecar's configuration file have the same variable set, the DB environment variables will take precedence.

It is possible to omit the PostgreSQL configuration from the Sidecar's configuration file entirely. In this case, the Sidecar will attempt to connect to PostgreSQL using the database environment variables and will fall back to default values for non-critical variables.

[storage.postgresql_config]
database_name = "event_sidecar"
host = "localhost"
database_password = "p@$$w0rd"
database_username = "postgres"
max_connections_in_pool = 30
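The precedence rule described above (environment variables override the configuration file) can be illustrated with a short sketch. This models the rule only and is not the Sidecar's code. The mapping from environment variables to configuration keys is an assumption based on the names shown in this README; in particular, the port key does not appear in the sample configuration and is hypothetical.

```python
import os

# Assumed mapping; "port" is hypothetical (not shown in the sample config).
ENV_TO_CONFIG_KEY = {
    "SIDECAR_POSTGRES_USERNAME": "database_username",
    "SIDECAR_POSTGRES_PASSWORD": "database_password",
    "SIDECAR_POSTGRES_DATABASE_NAME": "database_name",
    "SIDECAR_POSTGRES_HOST": "host",
    "SIDECAR_POSTGRES_MAX_CONNECTIONS": "max_connections_in_pool",
    "SIDECAR_POSTGRES_PORT": "port",
}


def resolve_postgres_settings(file_config, environ=None):
    """Merge file settings with environment variables; the environment wins."""
    environ = os.environ if environ is None else environ
    resolved = dict(file_config)
    for env_name, key in ENV_TO_CONFIG_KEY.items():
        if env_name in environ:
            # Values from the environment are kept as strings here.
            resolved[key] = environ[env_name]
    return resolved
```

With host = "localhost" in the file and SIDECAR_POSTGRES_HOST set to another value, the resolved host is the one from the environment, while keys that are set only in the file are left untouched.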

Rest & Event Stream Criteria

This information determines outbound connection criteria for the Sidecar's rest_server.

[rest_server]
port = 18888
max_concurrent_requests = 50
max_requests_per_second = 50
request_timeout_in_seconds = 10

[event_stream_server]
port = 19999
max_concurrent_subscribers = 100
event_stream_buffer_length = 5000

The event_stream_server section specifies a port for the Sidecar's event stream.

Additionally, the section takes the two other options shown in the sample above: max_concurrent_subscribers and event_stream_buffer_length.

Admin Server

This optional section configures the Sidecar's administrative server. If this section is not specified, the Sidecar will not start an admin server.

[admin_server]
port = 18887
max_concurrent_requests = 1
max_requests_per_second = 1

Access the admin server at http://localhost:18887/metrics/.

Swagger Documentation

Once the Sidecar is running, access the Swagger documentation at http://localhost:18888/swagger-ui/. You need to replace localhost with the IP address of the machine running the Sidecar application if you are running the Sidecar remotely. The Swagger documentation will allow you to test the REST API.

OpenAPI Specification

An OpenAPI schema is available at http://localhost:18888/api-doc.json/. You need to replace localhost with the IP address of the machine running the Sidecar application if you are running the Sidecar remotely.

Unit Testing the Sidecar

You can run the unit and integration tests included in this repository with the following command:

cargo test

You can also run the performance tests using the following command:

cargo test -- --include-ignored

The EXAMPLE_NCTL_CONFIG.toml file contains the configurations used for these tests.

Running the Sidecar

After creating the configuration file, run the Sidecar using Cargo and point to the configuration file using the --path-to-config option, as shown below. The command needs to run with root privileges.

sudo cargo run -- --path-to-config EXAMPLE_NODE_CONFIG.toml

The Sidecar application leverages tracing, which can be controlled by setting the RUST_LOG environment variable.

The following command will run the sidecar application with the INFO log level.

RUST_LOG=info cargo run -p casper-event-sidecar -- --path-to-config EXAMPLE_NCTL_CONFIG.toml

The log levels, listed in order of increasing verbosity, are ERROR, WARN, INFO, DEBUG, and TRACE.

Further details about log levels can be found here.

Testing the Sidecar using NCTL

The Sidecar application can be tested against live Casper nodes or a local NCTL network.

The configuration shown within this README will direct the Sidecar application to a locally hosted NCTL network if one is running. The Sidecar should function the same way it would with a live node, displaying events as they occur in the local NCTL network.

Troubleshooting Tips

This section covers helpful tips when troubleshooting the Sidecar service. Replace the URL and ports provided in the examples as appropriate.

Checking liveness

To check whether the Sidecar is running, run the following curl command, which returns the newest stored block.

curl http://SIDECAR_URL:SIDECAR_REST_PORT/block

Each block should have a .block.header.timestamp field. Even if there were no deploys, a block should be produced every 30-60 seconds. If the timestamp of the latest stored block falls behind, the Sidecar is having trouble reading events from the node. If jq is installed, the following command prints just the timestamp:

curl http://SIDECAR_URL:SIDECAR_REST_PORT/block | jq '.block.header.timestamp'
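The same check can be automated once the JSON response has been fetched. The sketch below assumes the timestamp is an ISO 8601 string in UTC (for example ending in Z); the function names and the 120-second threshold, twice the upper bound of the expected block interval, are illustrative choices rather than values taken from the Sidecar.

```python
from datetime import datetime, timezone


def block_age_seconds(block_response, now=None):
    """Seconds elapsed since the block in a parsed /block response was produced."""
    timestamp = block_response["block"]["header"]["timestamp"]
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    produced_at = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (now - produced_at).total_seconds()


def is_stale(block_response, max_age_seconds=120, now=None):
    """True if the newest stored block is older than max_age_seconds."""
    return block_age_seconds(block_response, now) > max_age_seconds
```

Pass the parsed body of the /block response, for instance the result of json.loads on the curl output, and alert when is_stale returns True.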

Checking the node connection

Checking the node connection status requires the admin server to be enabled, as shown in the Admin Server section. Use this curl command and observe the output:

curl http://SIDECAR_URL:SIDECAR_ADMIN_PORT/metrics

Sample output:

# HELP node_statuses Current status of node to which sidecar is connected. Numbers mean: 0 - preparing; 1 - connecting; 2 - connected; 3 - reconnecting; -1 - defunct -> used up all connection attempts ; -2 - defunct -> node is in an incompatible version
# TYPE node_statuses gauge
node_statuses{node="35.180.42.211:9999"} 2
node_statuses{node="69.197.42.27:9999"} 2

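The node_statuses output above shows which nodes are connecting, which are already connected, which have been abandoned because the connection attempts were used up, and so on. The number next to each node is the connection status, as listed in the HELP line: 0 means preparing, 1 connecting, 2 connected, 3 reconnecting, -1 defunct because all connection attempts were used up, and -2 defunct because the node is running an incompatible version.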
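For scripted monitoring, the metrics text can be turned into readable statuses. The following is a minimal sketch that handles only node_statuses lines of the shape shown in the sample output; the status names come from the HELP line, while the function name and the regular expression are illustrative.

```python
import re

# Status codes as documented in the node_statuses HELP line.
NODE_STATUS_NAMES = {
    0: "preparing",
    1: "connecting",
    2: "connected",
    3: "reconnecting",
    -1: "defunct (used up all connection attempts)",
    -2: "defunct (incompatible node version)",
}

# Matches lines such as: node_statuses{node="35.180.42.211:9999"} 2
_SAMPLE_LINE = re.compile(r'^node_statuses\{node="([^"]+)"\}\s+(-?\d+)\s*$')


def parse_node_statuses(metrics_text):
    """Map each node address to a human-readable connection status."""
    statuses = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE_LINE.match(line.strip())
        if match:
            node, code = match.group(1), int(match.group(2))
            statuses[node] = NODE_STATUS_NAMES.get(code, "unknown (%d)" % code)
    return statuses
```

Comment lines beginning with # HELP or # TYPE are skipped because they do not match the pattern.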

Diagnosing errors

To diagnose errors, look for error logs and check the error_counts on the metrics page, http://SIDECAR_URL:SIDECAR_ADMIN_PORT/metrics, where most of the errors related to data flow will be stored:

# HELP error_counts Error counts
# TYPE error_counts counter
error_counts{category="connection_manager",description="fetching_from_stream_failed"} 6

Monitoring memory consumption

To monitor the Sidecar's memory consumption, observe the metrics page, http://SIDECAR_URL:SIDECAR_ADMIN_PORT/metrics. Search for process_resident_memory_bytes:

# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 292110336

If memory consumption is high without an apparent reason, please inform the Sidecar team by creating an issue in GitHub.

Remember to check the event_stream_buffer_length setting in the configuration because it dramatically impacts how much memory the Sidecar consumes. Also, some events, like step events, consume more memory.

Ensuring sufficient storage

Ensuring enough space in the database is essential for the Sidecar to consume events produced from the nodes' SSE streams over an extended period. Each event is written to the database in a raw format for future processing. Running the Sidecar for weeks or months can result in storing multiple gigabytes of data. If the database runs out of space, the Sidecar will lose events, as it cannot record them.

Inspecting the REST API

The easiest way to inspect the Sidecar’s REST API is with Swagger.

Limiting concurrent requests

The Sidecar can be configured to limit concurrent requests (max_concurrent_requests) and requests per second (max_requests_per_second) for the REST and admin servers.

However, remember that those are application-level guards, meaning that the operating system already accepted the connection, which used up the operating system's resources. Limiting potential DDoS attacks requires consideration before the requests are directed to the Sidecar application.
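To make the idea of an application-level guard concrete, here is a token-bucket sketch of a requests-per-second limit. It is a generic illustration, not the Sidecar's implementation; the class name and the injectable clock are assumptions made for the example. Note that the check runs only after a request has already reached the application, which is why it cannot replace protection placed in front of the Sidecar.

```python
import time


class TokenBucket:
    """Allow at most roughly rate_per_second requests, refilled continuously."""

    def __init__(self, rate_per_second, clock=time.monotonic):
        self._rate = float(rate_per_second)
        self._capacity = float(rate_per_second)
        self._tokens = self._capacity
        self._clock = clock
        self._last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self._clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

With TokenBucket(50), a burst of up to 50 requests is accepted at once, after which further requests are rejected until time passes and tokens are refilled.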