The server can hang for different reasons.
Once, we noticed only 3 hours after the incident (Redis failed).
Today we noticed that the server was down only when I couldn't get any data while writing a blog post.
As an admin, I want to monitor the server status and have an auto-restore system, so that I can be sure my users are happy.
Acceptance criteria
[x] there is a monitoring system that checks that the server and all related daemons are running well
Specstore
Rawstore
web-server
... what else?
[x] monitoring system notifies the administrator if something crashes
[ ] monitoring system tries to restore failed daemons, Docker instances, or any other failed component
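
A minimal sketch of how the three criteria above could fit together, not the monitoring system actually deployed. Everything here is an assumption for illustration: the `SERVICES` map (the names come from the list above, while the hosts and ports are placeholders, apart from 6379 being the default Redis port), the `is_listening` and `check_once` helpers, and the `notify` and `restart` callbacks that the caller would supply.

```python
import socket
from typing import Callable, Dict, List, Tuple

# Hypothetical service map: hosts and ports are placeholders, not real config.
SERVICES: Dict[str, Tuple[str, int]] = {
    "specstore": ("127.0.0.1", 8001),
    "rawstore": ("127.0.0.1", 8002),
    "web-server": ("127.0.0.1", 80),
    "redis": ("127.0.0.1", 6379),
}


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_once(
    services: Dict[str, Tuple[str, int]],
    notify: Callable[[str], None],
    restart: Callable[[str], None],
) -> List[str]:
    """Check every service once; notify and attempt a restart for each failure."""
    failed = [
        name
        for name, (host, port) in services.items()
        if not is_listening(host, port)
    ]
    for name in failed:
        notify(f"{name} is not responding")
        restart(name)
    return failed
```

Running `check_once(SERVICES, notify=print, restart=lambda name: None)` on a schedule would cover the first two criteria; the third depends on what `restart` does for each service. A TCP check only shows that a port accepts connections, so a daemon that is hung but still listening would need a deeper, service-specific probe.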
Issue in the Frontend repo
Acceptance criteria
Tasks
Tests:
data push
HUGE data in several threads
Analysis