Closed krptg0 closed 9 months ago
This looks the same as #13527. Please read through that issue to see if the workaround helps.
Yup, that's exactly the same issue. The workaround using docker.io/rkachach/ceph:v18.2.1_patched_v1
did work.
Good to hear it worked, will close this issue then
@travisn & @krptg0 any news on this?
@R-Studio I think it has been fixed in v18.2.2. Everything works out of the box now with the latest releases (that's why the issue has been closed).
@krptg0 Thank you very much, I will update it next week. 😉
Is this a bug report or feature request?
Deviation from expected behavior: Only part of the wanted metrics are showing up in Grafana.
Expected behavior: All metrics should show up.
How to reproduce it (minimal and precise):
Don't really know if it's tied to updating from 1.12.
File(s) to submit:
Cluster CR (custom resource), typically called cluster.yaml, if necessary
Logs to submit:
Operator's logs, if necessary
Crashing pod(s) logs, if necessary
To get logs, use kubectl -n <namespace> logs <pod name>
When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI. Read GitHub documentation if you need help.
Cluster Status to submit:
Output of kubectl commands, if necessary
To get the health of the cluster, use kubectl rook-ceph health
To get the status of the cluster, use kubectl rook-ceph ceph status
For more details, see the Rook kubectl Plugin
Environment:
Kernel (e.g. uname -a):
Rook version (use rook version inside of a Rook Pod): 1.13.2
Storage backend version (e.g. for ceph do ceph -v): 18.2.1
Kubernetes version (use kubectl version):
Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): OK
After updating, I still have 2 running mgrs. One of them (not always the active one) has Prometheus enabled, and I can curl localhost:9283 from within the pod. First clue is the HTTP answer:
Not really sure why the metrics are empty.
The other one simply denies my request:
My Prometheus instance tells me "Connection refused". I tried both my historical one in the "monitoring" NS (from the KPS Helm chart) and the one provided by the Rook documentation, which leverages the Prometheus operator.
Since the new dashboard also relies on a Prometheus instance to retrieve metrics for the main graph, my dashboard is currently empty and I can't follow anything going on with the cluster.
The ceph-exporter pods are working as intended and are scraped as intended by the Prometheus that is "external" to the rook-ceph NS. I didn't change any configuration on this.
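The manual curl check against each mgr described above could be repeated for all mgr pods with a small loop. This is only a sketch under assumptions that are not stated in the report: the namespace is rook-ceph, the mgr pods carry the label app=rook-ceph-mgr, the container is named mgr, and curl is available inside it. probe_mgr_metrics is a hypothetical helper name.

```shell
# Hypothetical helper: count the metric lines each mgr pod serves on port 9283.
# Assumed: namespace "rook-ceph" (override with $1), label app=rook-ceph-mgr,
# container name "mgr", and curl present inside the mgr container.
probe_mgr_metrics() {
  ns="${1:-rook-ceph}"
  for pod in $(kubectl -n "$ns" get pods -l app=rook-ceph-mgr -o name); do
    # Count lines that start like a metric name; comment lines (# HELP, # TYPE) are skipped.
    # A refused connection or an empty answer both yield 0.
    count=$(kubectl -n "$ns" exec "$pod" -c mgr -- \
      curl -s localhost:9283/metrics | grep -c '^[A-Za-z_:]')
    echo "$pod: $count metric lines"
  done
}

# Usage (needs access to the cluster):
#   probe_mgr_metrics            # default namespace rook-ceph
#   probe_mgr_metrics my-ceph-ns # hypothetical other namespace
```

A count of 0 on one mgr and a non-zero count on the other would match the behaviour reported here, but it does not by itself distinguish an empty answer from a refused connection.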
EDIT: mgr.a, which is the one not responding with Connection Refused, just gave me this:
EDIT2:
Just rolled back to 18.2.0, and the metrics are back. The issue is up in the Ceph tracker: https://tracker.ceph.com/issues/64051
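For anyone wanting to do the same rollback, or to try the patched image mentioned in the thread, one possible way is to change spec.cephVersion.image on the CephCluster resource. The sketch below assumes the CephCluster is named rook-ceph in the rook-ceph namespace and that the 18.2.0 image is quay.io/ceph/ceph:v18.2.0. Neither is stated in the report, and both helper names are hypothetical.

```shell
# Hypothetical helper: build a JSON merge patch that sets spec.cephVersion.image.
ceph_image_patch() {
  printf '{"spec":{"cephVersion":{"image":"%s"}}}\n' "$1"
}

# Hypothetical wrapper: apply the patch to a CephCluster.
# Assumed defaults: namespace "rook-ceph" ($2) and cluster name "rook-ceph" ($3).
set_ceph_image() {
  kubectl -n "${2:-rook-ceph}" patch cephcluster "${3:-rook-ceph}" \
    --type merge -p "$(ceph_image_patch "$1")"
}

# Usage (needs access to the cluster; image names are assumptions, adjust as needed):
#   set_ceph_image quay.io/ceph/ceph:v18.2.0
#   set_ceph_image docker.io/rkachach/ceph:v18.2.1_patched_v1
```

The operator then rolls the Ceph daemons to the new image. If the cluster was installed through a Helm chart or GitOps, the image should be changed there instead, otherwise the patch may be reverted.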