Open ca-scribner opened 3 years ago
The answer is "Very Yes". We already have Prometheus metrics available and we just need to pull them.
We might be able to work with this:
https://github.com/kubeflow/kubeflow/issues/5216
I think we'd be able to find pods with the label notebook (like in our minio credential injector), and then scrape metrics using the blackbox exporter that @Ito-Matsuda has been playing around with.
It looks like it's possible to write a cross-namespace ServiceMonitor using that notebook label as a selector.
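
A rough sketch of what that cross-namespace ServiceMonitor could look like, written as a small Python helper that emits the manifest as JSON (which `kubectl apply -f -` accepts). Everything specific here is an assumption, not something confirmed in this thread: the label key `notebook-name`, the port name `http`, the `/metrics` path, the `notebook-metrics` name, and the `monitoring` namespace are all placeholders. Note also that a ServiceMonitor selects Services rather than pods, so this assumes the notebook Services carry the label too; if only the pods do, a PodMonitor would be the closer fit.

```python
import json


def notebook_service_monitor(label_key="notebook-name", port="http", path="/metrics"):
    """Build a ServiceMonitor manifest matching labelled Services in any namespace.

    label_key, port, and path are hypothetical defaults; adjust to the cluster.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        # Placeholder name and namespace for the monitor object itself.
        "metadata": {"name": "notebook-metrics", "namespace": "monitoring"},
        "spec": {
            # any: true lets the monitor look across all namespaces.
            "namespaceSelector": {"any": True},
            # Match any Service that has the label, whatever its value.
            "selector": {
                "matchExpressions": [{"key": label_key, "operator": "Exists"}]
            },
            "endpoints": [{"port": port, "path": path, "interval": "30s"}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(notebook_service_monitor(), indent=2))
```

Using `Exists` instead of `matchLabels` means we don't have to know the label values up front, since each notebook will likely have a different one.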
To help detect and fix bugs, let's improve the information we collect from Jupyter Notebooks. Some ideas:
More brainstorming would be helpful here.