Open jhamman opened 6 years ago
Side note: as of 8/30, we have Google Analytics tracking set up on binder.pangeo.io, so at least we have some basic diagnostics.
You need to install the prometheus and grafana helm charts.
There is some example config for pangeo.pydata.org in the PR you mentioned.
@jacobtomlinson - that was pretty straightforward. I deployed this using the configs here: 95242a8d4bcc14ae3ac0350bc8b9dbaef5f33d7f. This is live here (for now): http://35.238.151.129/
However, the pangeo dashboard isn't playing nice. Any thoughts on what's going wrong here?
It is probably a namespace thing. The dashboard assumes the Kubernetes namespace your Pangeo is in is called pangeo. We should fix this.
Yes, my namespace is pangeo-binder. I'll take a look.
If you edit each graph you should see where the namespace is specified in the query. Updating this should fix it, but it's rather labor-intensive, sorry!
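If editing every graph by hand gets tedious, the same change could probably be scripted against the exported dashboard JSON. This is only a rough sketch: it assumes the queries live in "expr" fields and select the namespace with a label matcher of the form namespace="pangeo", which may not match the actual dashboard. The function name, the sample panel, and the metric in the example are illustrative, not taken from the real config.

```python
import json
import re


def rewrite_namespace(node, old="pangeo", new="pangeo-binder"):
    """Return a copy of a dashboard structure with the namespace label
    value replaced in every "expr" query string (assumed field name)."""
    # Matches namespace="pangeo" or namespace=~"pangeo", but not
    # namespace="pangeo-binder", so running it twice is harmless.
    pattern = re.compile(r'(namespace\s*=~?\s*")' + re.escape(old) + r'(")')
    if isinstance(node, dict):
        return {
            key: pattern.sub(lambda m: m.group(1) + new + m.group(2), value)
            if key == "expr" and isinstance(value, str)
            else rewrite_namespace(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [rewrite_namespace(item, old, new) for item in node]
    return node


# Hypothetical one-panel dashboard, only to show the shape of the change.
dashboard = {
    "title": "pangeo pods",
    "panels": [
        {"targets": [{"expr": 'sum(kube_pod_info{namespace="pangeo"})'}]}
    ],
}

updated = rewrite_namespace(dashboard)
print(json.dumps(updated, indent=2))
```

Only the query strings are touched, so titles and other text that happen to contain "pangeo" stay as they are. The result would still need to be re-imported into Grafana and checked panel by panel.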
Thanks @jacobtomlinson! This is mostly working now. We'll want to make some minor changes but this seems to be working.
Possibly off topic question: can we change the skin of the grafana dashboard? I can't stand the black background.
I will leave this open until I have a chance to clean up the dashboard and move the ip address to grafana.binder.pangeo.io.
@rabernat - I don't know for sure but I would imagine this is possible.
Cross-referencing https://github.com/jupyterhub/mybinder.org-deploy/issues/726. It seems like we'll need to scrape the mybinder.org configs from their grafana site.
I'm looking to enable Prometheus / Grafana for the AWS BinderHub, so it'll happen for the GCP BinderHub as well.
@TomAugspurger looks like the CI deployment failed: https://app.circleci.com/pipelines/github/pangeo-data/pangeo-binder/152/workflows/7e61c31d-0a15-45fa-a9e7-4ca381b2a7b9/jobs/159
I think that error is from prometheus-operator; I've seen GitHub issues talking about it for the CustomResourceDefinitions: https://github.com/helm/charts/issues/23413. They list a fix in the readme, but that's only supposed to apply for helm > v2.14. Might need the six lines of kubectl apply -f ... in .circleci/config.yaml.
Thanks, looking now.
I'm pretty stumped right now. I wonder if kube-lego is causing issues, if it hasn't been updated to stop using extensions/v1beta1 now that it's deprecated?
Did we need kube-lego? I thought ingress-nginx was replacing it a bit.
For the rest, 4 of the 5 it couldn't find in the GCP deployment are from prometheus-operator. The 4 in the AWS deployment that it couldn't find are the same as the GCP ones. I can add the CRD install lines to the CI Action.
mybinder.org has https://grafana.mybinder.org. It would be good to combine this config with https://github.com/pangeo-data/pangeo/pull/359 once that is finalized.
@jacobtomlinson - if you can help guide me in the right direction, I'm ready to give this a try.