Grafana to see platform usage (ltetrel opened this issue 5 years ago)
Prometheus (k8s deployment with manual configs) + Grafana (installed via Helm). I drafted an ugly dashboard to visualize some of the metrics from BinderHub, you can see mybinder's grafana here :) Grafana is running on localhost:3000 (currently on node3); I ssh-forwarded it to my laptop to see what's going on.
Prometheus and Grafana are under the monitoring namespace. You can see the services and pods:
```
# Pods responsible for monitoring
kubectl get pods -n monitoring

# Services responsible for monitoring
kubectl get svc -n monitoring

# Describe the Prometheus pod
kubectl describe pods prometheus-deployment-bc795b5f4-vz56s -n monitoring
```
I wondered whether the in-progress info was accurate and started a build; it looks like it's working well:
Nice! So how do we access this now? Maybe we can create a subdomain (binder-graphana.conp.cloud) to view it from the outside?
I think we don't have to bother exposing it to an external IP for now. Plus, Grafana also requires setting up an SMTP server so that we can add users, set alerts, and such. For a while, to make sure that it works properly, we can use port forwarding. You can do the following after ssh-ing in:
Get the Grafana pod name:
```
export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=grafana,release=grafana" -o jsonpath="{.items[0].metadata.name}")
```
Forward Grafana to port 3000:
```
kubectl --namespace monitoring port-forward $POD_NAME 3000
```
Open a new terminal and forward port 3000 of the remote to your local machine through ssh tunneling:
```
ssh -L 4000:127.0.0.1:3000 ubuntu@binder.conp.cloud
```
On your local machine, just visit http://localhost:4000. If it asks for a password, ping me on Slack :)
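If the two-terminal dance gets tedious, both steps can be collapsed into one command run from your laptop. This is just a sketch; it assumes kubectl is configured for the ubuntu user on the remote host and that the label selector above still matches:
```
# Hypothetical one-liner: open the ssh tunnel and run the port-forward on the
# remote in one go (-t keeps the remote command attached so Ctrl-C tears it down).
ssh -t -L 4000:127.0.0.1:3000 ubuntu@binder.conp.cloud \
  'kubectl --namespace monitoring port-forward \
     $(kubectl get pods --namespace monitoring -l "app=grafana,release=grafana" \
       -o jsonpath="{.items[0].metadata.name}") 3000'
```
Then http://localhost:4000 works exactly as before.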
PS: Prometheus is accessible as a service on port 30000; you can forward it to another localhost port of your machine (e.g. 4001 🤷‍♂️) and see them side by side.
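For example, a second tunnel could look like this (a sketch; it assumes the NodePort is reachable on the node's localhost, otherwise substitute the node's IP):
```
# Tunnel the Prometheus NodePort (30000) to local port 4001 (port choice is arbitrary).
ssh -L 4001:127.0.0.1:30000 ubuntu@binder.conp.cloud
```
With both tunnels up, Grafana is on http://localhost:4000 and Prometheus on http://localhost:4001.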
A simple binder to test the dashboard's in-progress launch panel:
Is there a way to automate this step? Also, you mentioned it is running on node3; will this Grafana work just for this node, or will it monitor all the other nodes too?
Prometheus scrapes metrics from the endpoints of all the nodes; you can see them on the Prometheus dashboard under Status (port 30000). Grafana has access to all of them as a data source.
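If you'd rather check the scrape targets from the command line, the Prometheus HTTP API exposes the same list. A sketch, run on the node itself (or against your tunneled port), assuming jq is installed:
```
# List every active scrape target and its health state via the Prometheus API.
curl -s http://localhost:30000/api/v1/targets | \
  jq -r '.data.activeTargets[] | "\(.scrapeUrl) \(.health)"'
```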
The installations are on the k8s master node, so it should be up to the load balancer to select a worker for the service. To point Grafana at Prometheus, I had to use the IP of the Prometheus k8s service (the address that shows up when you do kubectl describe svc -n monitoring); if this IP is not dependent on the node on which the pod is spawned, it should make it through load balancing.
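To make that concrete: the ClusterIP shown for the service is stable cluster-wide, and the service's in-cluster DNS name is even more robust since it survives the service being recreated. In the sketch below, prometheus-service and port 9090 are assumptions; check the real name and port with the first command:
```
# The CLUSTER-IP column is the address Grafana can use as its data source URL.
kubectl get svc -n monitoring
# Equivalent in-cluster DNS form (service name and port are hypothetical):
# http://prometheus-service.monitoring.svc.cluster.local:9090
```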
If by automating you mean making it permanently available at a given address, sure.
By automating, I just mean reducing the burden of the current method (get the Grafana pod name, forward Grafana to port 3000) :)
Sure, we can expose it as a k8s service and access it over DNS or an IP right away, to free you from the burden of the current method :)
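A minimal sketch of that, assuming the Grafana deployment is simply named grafana and listens on port 3000 (check kubectl get deploy -n monitoring for the real name first):
```
# Expose Grafana as a NodePort service, reachable on every node's IP.
kubectl expose deployment grafana --namespace monitoring \
  --type=NodePort --port=3000 --target-port=3000 --name=grafana-service
# See which node port (in the 30000-32767 range) was assigned:
kubectl get svc grafana-service -n monitoring
```
From there, pointing a DNS record such as binder-graphana.conp.cloud at a node IP (plus a firewall rule) would make it reachable from outside.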
Hey @agahkarakuzu, do you have some documentation on how to set up Grafana (in case we want to do it for other servers)? We are working with @mathieuboudreau on a reproducible environment for creating new BinderHub instances on our allocation. Also, do you have docs that explain how to use your Grafana?
Thanks :)
Hey @ltetrel, I don't have thorough documentation, I just used Helm and some custom k8s configs. The resources I used are on the master. I had some handwritten notes, which are 9000 km away from me atm :/ I can do that later, but if it is super urgent, I am afraid that I won't be available for a while.
To interact with Grafana, you can refer to https://github.com/neurolibre/neurolibre-binderhub/issues/5#issuecomment-539133153. As for usage, it is super intuitive; you can create/modify dashboards effortlessly. Once you log in, the rest will make itself clear to you, I believe.
ok thanks, we will work on the documentation later then
Since I ran your commands, we cannot access https://binder.conp.cloud anymore. Do you have any ideas? Maybe the port forwarding is not safe with BinderHub? Did you test it on the master node or just on your side?
I am not sure if it is because of the Arbutus outage over the last few days or these commands :/ From the cc side, they told me everything is fine on their end...
```
kubectl --namespace monitoring port-forward $POD_NAME 3000
```
There is nothing dangerous about port forwarding; services in the cluster do it all the time, and I have used this many times without any problem. In any case, it has nothing to do with BinderHub itself, other than that they are both running on the same node.
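To narrow down what actually broke, a quick cluster sanity check might help (a generic sketch, nothing BinderHub-specific assumed):
```
# Any pod that is not Running is a suspect (Completed jobs also show up; those are fine).
kubectl get pods --all-namespaces --field-selector=status.phase!=Running
# Confirm the BinderHub/JupyterHub services and their ports still exist:
kubectl get svc --all-namespaces
```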