canonical / kubeflow-examples

Charmed Kubeflow examples
Apache License 2.0

Switch to using grafana-agent-k8s charm for COS integration #41

Closed nishant-dash closed 9 months ago

nishant-dash commented 1 year ago

I would like to propose switching to the grafana-agent-k8s charm for integrating with a COS deployment, rather than using individual cross-model/cross-controller relations (CMRs).

With individual relations we hit 2 issues:

  1. Routing. If Kubeflow is bootstrapped onto one k8s cluster and COS onto another (COS usually sits separately in its own microk8s cluster), then establishing a cross-controller relation from, say, argo to the COS Prometheus offers argo's pod IP, which the COS Prometheus has no way of reaching.

    For example:

    • You create the CMR:

      juju add-relation argo-controller admin/cos.prometheus-scrape
    • But Prometheus then gets a scrape target such as http://192.168.1.168:9090/metrics, and the scrape fails because Prometheus cannot reach that endpoint.

    • With grafana-agent-k8s, the agent takes care of pushing metrics (remote write) to Prometheus, and it can reach, and be reached by, the COS Prometheus via its CMR.

  2. CMRs
    • If you relate each charm to COS individually, you have a lot of CMRs to manage, whereas with grafana-agent-k8s you would just need

      juju add-relation argo-controller grafana-agent-k8s
      # and see that you get this
      argo-controller:metrics-endpoint               grafana-agent-k8s:metrics-endpoint             prometheus_scrape   regular
    • This is just a local-model relation, and the same is true for every other application in Kubeflow, so you end up with only 1 CMR between grafana-agent-k8s and the COS Prometheus. The same applies to Grafana dashboards and Loki logging. The general idea is that you establish local relations between each application and grafana-agent-k8s for metrics, dashboards and logs, and the agent holds only 1 CMR to each of the COS applications (Prometheus, Grafana and Loki), reducing your total CMR count to just 3.
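
To make the pattern above concrete, here is a minimal dry-run sketch that only prints the juju commands instead of running them. It covers the Prometheus path only; Grafana dashboards and Loki logging would follow the same shape. The offer name `prometheus-receive-remote-write` and the agent endpoint `send-remote-write` are assumptions on my part and are not taken from this issue, so check them against `juju status --relations` and your own COS offers before use.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the juju commands that wire Kubeflow
# charms to COS Prometheus through a single grafana-agent-k8s application.

AGENT="grafana-agent-k8s"
# ASSUMPTION: hypothetical offer URL. Only admin/cos.prometheus-scrape
# appears in the issue; adjust this to the offer your COS model exposes.
PROMETHEUS_OFFER="admin/cos.prometheus-receive-remote-write"
# ASSUMPTION: endpoint name on the agent used for remote write.
AGENT_WRITE_ENDPOINT="send-remote-write"

print_relations() {
    # One local-model relation per Kubeflow application given as an argument.
    for app in "$@"; do
        echo "juju add-relation $app $AGENT"
    done
    # A single cross-model relation carries all metrics to the COS Prometheus.
    echo "juju add-relation $AGENT:$AGENT_WRITE_ENDPOINT $PROMETHEUS_OFFER"
}

# Example with the one application named in this issue.
print_relations argo-controller
```

Passing more application names to `print_relations` adds one local relation each, while the number of CMRs to Prometheus stays at one.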

NohaIhab commented 9 months ago

The COS integration guide was rewritten accordingly and published: https://charmed-kubeflow.io/docs/integrate-with-cos