kinvolk / lokomotive

🪦 DISCONTINUED Further Lokomotive development has been discontinued. Lokomotive is a 100% open-source, easy to use and secure Kubernetes distribution from the volks at Kinvolk
https://kinvolk.io/lokomotive-kubernetes/
Apache License 2.0

Monitoring for the Storage Systems #398

Open surajssd opened 4 years ago

surajssd commented 4 years ago

Right now we use OpenEBS as the default storage for Prometheus and Alertmanager. Rook Ceph can also be used for that, as mentioned in #381.

But the essential question is: where do we store the metrics of the storage provider itself? Storing Rook Ceph's metrics in a Prometheus instance that is backed by Rook Ceph creates a cyclic dependency. If the storage backing Prometheus goes haywire, the monitoring system becomes unresponsive along with it, and we are left with no metrics to figure out what went wrong.

The Rook docs highlight this problem as follows:

NOTE: It is not recommended to consume storage from the Ceph cluster for Prometheus. If the Ceph cluster fails, Prometheus would become unresponsive and thus not alert you of the failure.

src: https://rook.io/docs/rook/v1.3/ceph-monitoring.html#prometheus-instances
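To make the cyclic dependency concrete, here is a minimal sketch in Python that models "needs this to stay usable or diagnosable" as a graph and looks for a loop. The component names (`storage-prometheus`, `local-disk`) and the `find_cycle` helper are hypothetical and only for illustration; they are not Lokomotive or Rook configuration keys.

```python
def find_cycle(deps, start):
    """Return a dependency path leading from `start` back to itself, or None."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for dep in deps.get(node, []):
            if dep == start:
                return path + [dep]
            if dep not in seen:
                seen.add(dep)
                stack.append((dep, path + [dep]))
    return None


# Assumed setup 1: Prometheus stores its data on Rook Ceph, and diagnosing
# Rook Ceph relies on the metrics held in that same Prometheus.
cyclic = {
    "prometheus": ["rook-ceph"],
    "rook-ceph": ["prometheus"],
}

# Assumed setup 2: storage metrics go to a separate (hypothetical) Prometheus
# instance that does not consume storage from Rook Ceph.
separate = {
    "prometheus": ["rook-ceph"],
    "rook-ceph": ["storage-prometheus"],
    "storage-prometheus": ["local-disk"],
}

print(find_cycle(cyclic, "prometheus"))
print(find_cycle(separate, "prometheus"))
```

In the first setup the check reports a loop through `rook-ceph`; in the second, the chain ends at storage that is independent of the system being monitored, which is the property this issue asks for.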

Lokomotive should provide a way to monitor the storage systems that is separate from the monitoring system used for everything else on the cluster.

surajssd commented 4 years ago

Commutatively, this issue could also be titled "Storage for Monitoring Systems".