[x] My branch is up-to-date with upstream/develop branch.
[x] Everything works and has been tested on Python 3.8.0 and above.
[x] I ran pre-commit checks against my changes.
[x] I've written tests for my changes, and all existing tests pass.
Current behaviour
Snapshotter-lite does not have a proper health check endpoint in core-api, and it does not currently have any means of externally reporting issues to node operators.
New expected behaviour
The /health endpoint in snapshotter.core-api now requests data from the powerloom-reporting-service (if enabled) to determine the health status of the node, and returns a failed status code if the node has any issues. It checks when the node last pinged the reporting service and whether any issues have been reported to the service.
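The two checks described above can be sketched as a small pure function. This is only an illustration, not the code in this PR: `ReportingStatus`, `evaluate_health`, the 300-second staleness threshold, and the 503 status code are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical threshold: how old the last ping may be before the node
# is treated as unhealthy. The real value is not stated in this PR.
MAX_PING_AGE_SECONDS = 300


@dataclass
class ReportingStatus:
    """Hypothetical shape of the data fetched from the reporting service."""
    last_ping_timestamp: Optional[float]  # Unix seconds; None if never pinged
    reported_issues: List[str] = field(default_factory=list)


def evaluate_health(
    status: ReportingStatus,
    now: Optional[float] = None,
    max_ping_age: float = MAX_PING_AGE_SECONDS,
) -> Tuple[int, dict]:
    """Return an (HTTP status code, response body) pair for a health check."""
    now = time.time() if now is None else now

    # Check 1: the node must have pinged the reporting service recently.
    if status.last_ping_timestamp is None:
        return 503, {"status": "error", "reason": "no ping recorded"}
    if now - status.last_ping_timestamp > max_ping_age:
        return 503, {"status": "error", "reason": "last ping is stale"}

    # Check 2: no issues may have been reported to the service.
    if status.reported_issues:
        return 503, {"status": "error", "reason": "issues reported"}

    return 200, {"status": "ok"}
```

Keeping the decision in a function that takes `now` as a parameter makes the staleness check easy to unit test without mocking the clock.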
Additionally, the node operator can now provide a Telegram Chat ID on startup to enable automatic Telegram notifications when issues arise while snapshotting. This feature requires the telegram-reporting-host endpoint to be active and set in the .env file.
Change logs
Added
Telegram notifications for missed snapshots and failed epoch processing
Changed
Core-api /health endpoint now monitors the health of the node using data from the reporting service
Updated snapshotter.utils.callback_helpers to include the Telegram reporting option
Replaced the Snapshotter Issue data model with SnapshotterReportData to allow for additional reporting metrics
Deployment Instructions
The deployment pattern has not changed, but there is now a prompt on startup to enter an optional Telegram Chat ID. Users can request a Chat ID by starting a conversation with the @PowerloomReportingBot.
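Because the Chat ID is optional, the startup prompt has to accept an empty answer. A minimal sketch of that handling is below. `parse_optional_chat_id` is a hypothetical helper, not a function from this PR, and it assumes the value is kept as an integer (Telegram chat IDs are integers and can be negative for group chats).

```python
from typing import Optional


def parse_optional_chat_id(raw: str) -> Optional[int]:
    """Parse the operator's answer to the Chat ID prompt.

    Returns None when the answer is blank (notifications stay disabled),
    otherwise the Chat ID as an integer.
    """
    value = raw.strip()
    if not value:
        return None
    try:
        return int(value)
    except ValueError:
        raise ValueError(
            f"Telegram Chat ID must be an integer, got {value!r}"
        ) from None
```

A blank answer then leaves Telegram reporting off, while a malformed one fails at startup instead of surfacing later as undelivered notifications.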
Fixes #