openbitlab / srvcheck

Srvcheck helps you to monitor blockchain nodes
MIT License

SRVCHECK


Srvcheck helps you monitor blockchain nodes and be promptly informed about unexpected scenarios.

It supports these ecosystems:

It also supports all types of Celestia nodes:

It supports these notification outputs:

And it offers many features thanks to the following tasks:

Solana specific tasks:

Aptos specific tasks:

Tendermint specific tasks:

Substrate specific tasks:

Near specific tasks:

Celestia Light and Full node specific tasks:

Celestia Validator node specific tasks:

We suggest adding the binary of the node to the PATH in order to benefit from all the monitoring features.

Telegram Bot Setup

To receive alerts on Telegram, you need to create a Telegram bot and set up a new Telegram group where the bot will send alerts.
Start the @BotFather bot on Telegram, then type /newbot to create a new bot and specify its name and username.
You should now have the token, which is required in a later step to install the monitor.
Then create a new Telegram group and add the newly created bot to it.
You will need the chat id; the easiest way to get it is to add @MissRose_bot to your group and then type /id in the group chat.
During the installation, you will pass the chat id and the token, in that order, to the -t flag of the install script.
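
Before running the installer, it can be useful to confirm that the token and chat id actually work together. The following is a minimal sketch, not part of srvcheck, that calls the Telegram Bot API `sendMessage` method using only the Python standard library. The helper names, the sample token and the message text are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def build_send_message_url(token: str) -> str:
    """Return the Bot API sendMessage URL for the given bot token."""
    return f"{API_BASE}/bot{token}/sendMessage"


def build_payload(chat_id: str, text: str) -> bytes:
    """Encode the form fields expected by sendMessage."""
    return urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()


def send_test_message(token: str, chat_id: str, text: str = "srvcheck test alert") -> dict:
    """POST a message to the chat and return the decoded JSON reply."""
    request = urllib.request.Request(
        build_send_message_url(token), data=build_payload(chat_id, text)
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Example (performs a real network call, so it is left commented out):
# reply = send_test_message("<tg_token>", "<tg_chat_id>")
# print(reply.get("ok"))
```

If the reply reports `"ok": true` and the message shows up in the group, the same two values can be passed to the install script.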

To send notifications of different severity to different chats, set the following fields in the configuration file:

infoLevelChatId =
warningLevelChatId =
errorLevelChatId =
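
As an illustration of how per-severity routing could behave, the sketch below picks a chat id for a given level. It assumes that an empty level-specific field falls back to the default `chatIds` value; that fallback is an assumption for illustration, not documented srvcheck behaviour, and `chat_for_level` is a hypothetical helper.

```python
from typing import Dict, Optional


def chat_for_level(conf: Dict[str, str], level: str) -> Optional[str]:
    """Return the chat id to use for `level` ("info", "warning" or "error").

    Assumption: when the level-specific field is missing or empty,
    the default `chatIds` value is used instead.
    """
    specific = conf.get(f"{level}LevelChatId", "").strip()
    if specific:
        return specific
    default = conf.get("chatIds", "").strip()
    return default or None


# Hypothetical values, for illustration only.
telegram_conf = {"chatIds": "-1001", "errorLevelChatId": "-1003"}
print(chat_for_level(telegram_conf, "error"))
print(chat_for_level(telegram_conf, "info"))
```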

Install & Update

curl -s https://raw.githubusercontent.com/openbitlab/srvcheck/main/install.sh | bash -s -- -t <tg_chat_id> <tg_token> -s <service_name> <optional_flags>

The install script can be customized with these flags (most of them are optional):

install --help
     --active-set <active_set_number> number of the validators in the active set (tendermint chain) [default is the number of active validators]
     --admin <@username> the admin telegram username that is interested in new governance proposals (tendermint)
 -a  --validator-address <address> enable checks on block production, governance proposals and other account related information
 -b  --block-time <time> expected block time [default is 60 seconds]
     --branch <name> name of the branch to use for the installation [default is main]
     --endpoint <url:port> node local rpc address
     --git <git_api> git api to query the latest release version installed
     --gov enable checks on new governance proposals (tendermint)
     --mount <mount_point> mount point where the node is installed
 -n  --name <name> monitor name [default is the server hostname]
     --rel <version> release version installed (required for tendermint chain if git_api is specified)   
     --signed-blocks <max_misses> <blocks_window> max number of blocks not signed in a specified blocks window [default is 5 blocks missed out of the latest 100 blocks]
 -s  --service <name> service name of the node to monitor [required]
 -t  --telegram <chat_id> <token> telegram chat options (id and token) where the alerts will be sent [required]
 -tl --telegram-levels <chat_info> <chat_warning> <chat_error> set different telegram chat ids for different severity levels
 -v  --verbose enable verbose installation
 -e  --exporter <port> enable prometheus exporter on <port> (port optional, default 9001)
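
To make the --signed-blocks parameters more concrete, here is a rough sketch of a sliding-window check using the documented defaults (5 misses out of the latest 100 blocks). It is not srvcheck's implementation: the class name is hypothetical, and whether an alert fires when the limit is reached or only when it is exceeded is an assumption (this sketch fires once the limit is reached).

```python
from collections import deque


class MissedBlockWindow:
    """Track signed/missed blocks over the latest `window` blocks."""

    def __init__(self, max_misses: int = 5, window: int = 100) -> None:
        self.max_misses = max_misses
        # deque with maxlen drops the oldest entry automatically.
        self.signed = deque(maxlen=window)

    def missed(self) -> int:
        """Number of unsigned blocks currently inside the window."""
        return sum(1 for was_signed in self.signed if not was_signed)

    def record(self, was_signed: bool) -> bool:
        """Record one block; return True if an alert should be raised."""
        self.signed.append(was_signed)
        return self.missed() >= self.max_misses


checker = MissedBlockWindow(max_misses=2, window=4)
for was_signed in [True, False, True, False]:
    alert = checker.record(was_signed)
print(alert)  # True: 2 of the latest 4 blocks were missed
```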

A few examples of the installation with optional flags:

Install with --git flag to get alerts on new node releases (in this case celestia-node)

curl -s https://raw.githubusercontent.com/openbitlab/srvcheck/main/install.sh | bash -s -- -t <tg_chat_id> <tg_token> -s <service_name> --git celestiaorg/celestia-node

Install with --admin and --gov flags to be tagged once new proposals are out

curl -s https://raw.githubusercontent.com/openbitlab/srvcheck/main/install.sh | bash -s -- -t <tg_chat_id> <tg_token> -s <service_name> --admin @MyTelegramUsername --gov

Outcomes

The following screenshots represent the chat outputs when the monitor is triggered by predetermined events.

Celestia node detection and task activation

Daily stats

System usage charts (in the last month or since node setup)

Configuration

Edit /etc/srvcheck.conf:

; telegram notifications 
[notification.telegram]
enabled = true
apiToken = 
chatIds = 
infoLevelChatId =
warningLevelChatId =
errorLevelChatId =

; a dummy notification which prints to stdout
[notification.dummy]
enabled = true

; chain settings
[chain]
; name to be displayed on notifications
name = 
; chain type (e.g. "tendermint" | "substrate")
type = 
; systemd service name
service = 
; endpoint uri, if different from default
endpoint = 
; block time
blockTime =
activeSet = 
thresholdNotsigned = 
criticalThresholdNotsigned = 
blockWindow = 
; Github repository (org/repo)
ghRepository = 
; software version
localVersion = 
; validator address
validatorAddress = 
; mount point
mountPoint = 

; task specific settings
[tasks]
; comma separated list of disabled tasks
disabled = TaskTendermintNewProposal
; enable auto recovery
autoRecover = true 
; Governance administrator (proposal voting, with @), optional
govAdmin =
; Prometheus exporter port
exporterPort =
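
The file uses an INI-style layout with `;` comments, so it can be inspected with Python's standard `configparser`. The sketch below is only an illustration of reading a few fields; the sample values are hypothetical, and it makes no claim about how srvcheck itself loads the file.

```python
import configparser

# Hypothetical sample values, following the layout of /etc/srvcheck.conf.
SAMPLE = """
; telegram notifications
[notification.telegram]
enabled = true
apiToken = 123:ABC
chatIds = -100123

[chain]
name = my-node
type = tendermint
service = mynode.service
blockTime = 60

[tasks]
disabled = TaskTendermintNewProposal
autoRecover = true
exporterPort =
"""


def load(text: str) -> configparser.ConfigParser:
    """Parse the config text, keeping camelCase option names as written."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # the default would lowercase option names
    parser.read_string(text)
    return parser


def disabled_tasks(parser: configparser.ConfigParser) -> list:
    """Split the comma separated `disabled` field into task names."""
    raw = parser.get("tasks", "disabled", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]


conf = load(SAMPLE)
print(disabled_tasks(conf))
print(conf.getboolean("tasks", "autoRecover"))
```

To check a real installation, the same parser can be pointed at the file with `parser.read("/etc/srvcheck.conf")` instead of `read_string`.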

Prometheus custom exporter: metrics

A custom exporter exposes metrics related to the Celestia node, with a fixed scraping frequency of 15s. The following metrics are exported:

| Name | Description | Type |
| --- | --- | --- |
| peers_count | Number of peers connected to the node | Gauge |
| node_height | Node height | Gauge |
| network_height | Network height | Gauge |
| out_of_sync_counter | Incremental value indicating how many times the node has entered a syncing state | Counter |
| first_header | Height of the first processed header in the latest block range | Gauge |
| latest_header | Height of the latest processed header in the latest block range | Gauge |
| finished_s | Processing time of the latest block range | Gauge |
| errors | Number of errors encountered during the processing of the latest block range | Gauge |
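
As a small example of consuming these metrics, the sketch below parses a Prometheus text-format payload and derives how far the node is behind the network. It assumes the metrics are exposed without labels or timestamps under exactly the names listed above; the sample payload and both helper names are hypothetical.

```python
# Hypothetical scrape output, for illustration only.
SAMPLE_METRICS = """\
# HELP peers_count Number of peers connected to the node
# TYPE peers_count gauge
peers_count 12
node_height 1000
network_height 1003
out_of_sync_counter 2
"""


def parse_metrics(text: str) -> dict:
    """Parse simple `name value` lines, skipping comments and blank lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # Assumption: no labels and no timestamps, so parts is [name, value].
        metrics[parts[0]] = float(parts[1])
    return metrics


def sync_lag(metrics: dict) -> int:
    """Blocks the node is behind the network (0 when fully synced)."""
    return int(metrics["network_height"] - metrics["node_height"])


values = parse_metrics(SAMPLE_METRICS)
print(sync_lag(values))  # 3
```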

Credits

Made with love by the Openbitlab team

License

Read the LICENSE file.