There is currently no monitoring of the infrastructure at all. The main reason to introduce monitoring is to get an alert when any of the critical services stops working as expected. Conditions worth alerting on may include:
The machine hosting the service is not reachable.
The machine hosting the service runs out of available storage.
The machine hosting the service cannot perform backups.
The processes providing the service are not running.
Certain application errors are logged.
Certain health checks are failing.
Others?
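To make the first two conditions concrete, here is a minimal sketch of what such checks could look like, using only the Python standard library. The function names, the 10% free-space threshold, and the idea of probing a TCP port are illustrative assumptions, not a decision about which tooling to adopt; an off-the-shelf monitoring system would likely cover these cases already.

```python
import shutil
import socket


def host_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def storage_low(path: str = "/", min_free_fraction: float = 0.10) -> bool:
    """Return True if the filesystem holding `path` has less free space than the threshold."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Treat a filesystem reporting no capacity as a problem.
        return True
    return usage.free / usage.total < min_free_fraction


def run_checks(host: str, port: int, path: str = "/") -> list[str]:
    """Run both checks and return a list of human-readable problems (empty if none)."""
    problems = []
    if not host_reachable(host, port):
        problems.append(f"{host}:{port} is not reachable")
    if storage_low(path):
        problems.append(f"storage on {path} is below the free-space threshold")
    return problems
```

A script like this could be run periodically, with a non-empty result from `run_checks` triggering a notification. The remaining conditions (backups, processes, logs, health checks) would need service-specific checks that are not sketched here.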
Critical services that would require monitoring:
[ ] Matrix server and Discord bridge, used by several people on a daily basis to stay in touch with the community.
Less critical services that would benefit from monitoring: