canonical / cos-proxy-operator

A machine charm that provides a single integration point in the machine world with the Kubernetes-based COS bundle.
https://charmhub.io/cos-proxy
Apache License 2.0

cos-proxy goes to `Blocked` if it is configured to be able to send data downstream but does not have any sources of that data #144

Closed ca-scribner closed 2 months ago

ca-scribner commented 2 months ago

Bug Description

As reported by Nishant, he has cos-proxy in a blocked state:

(truncated)

App                   Version  Status   Scale  Charm                       Channel        Rev  Exposed  Message
cos-proxy             n/a      blocked      1  cos-proxy                   latest/edge     81  no       Missing one of (Grafana|dashboard|grafana-agent) relation(s)
grafana-agent                  active      13  grafana-agent               latest/stable   65  no       
landscape-client               active       9  landscape-client            latest/stable   69  no       Client registered!
ntp                   4.2      active      12  ntp                         latest/stable   50  no       chrony: Ready
ubuntu-advantage               active      13  ubuntu-advantage            latest/stable   79  no       Attached (esm-apps,esm-infra,livepatch)

Unit                     Workload  Agent  Machine  Public address  Ports           Message
cos-proxy/0*             blocked   idle   12       10.27.64.20                     Missing one of (Grafana|dashboard|grafana-agent) relation(s)
  grafana-agent/10       active    idle            10.27.64.20                     
  landscape-client/8     active    idle            10.27.64.20                     Client registered!
  ntp/13                 active    idle            10.27.64.20     123/udp         chrony: Ready
  ubuntu-advantage/8     active    idle            10.27.64.20                     Attached (esm-apps,esm-infra,livepatch)

Integration provider                                      Requirer                                                     Interface                Type         Message
cos-proxy:cos-agent                                       grafana-agent:cos-agent                                      cos_agent                subordinate  
cos-proxy:downstream-grafana-dashboard                    cos-grafana-dashboards:grafana-dashboard                     grafana_dashboard        regular      
cos-proxy:downstream-prometheus-scrape                    cos-scrape-interval-config-metrics:configurable-scrape-jobs  prometheus_scrape        regular      
cos-proxy:juju-info                                       landscape-client:container                                   juju-info                subordinate  
cos-proxy:juju-info                                       ntp:juju-info                                                juju-info                subordinate  
cos-proxy:juju-info                                       ubuntu-advantage:juju-info                                   juju-info                subordinate  
nrpe:monitors                                             cos-proxy:monitors                                           monitors                 regular      

This deployment has

If I understand correctly, the issue is that cos-proxy is configured to send dashboards to cos-grafana-dashboards via the downstream-grafana-dashboard relation but currently has no source of dashboards, and the charm treats that situation as a Blocked status.

My first intuition is that this is not exactly a bug, but maybe a design decision we should revisit. Essentially the charm raises Blocked when it is properly configured but unused, which feels odd. I'll check with others though, as I might be misunderstanding this.
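To make the behaviour described above concrete, here is a minimal sketch of the rule as I understand it from the status output in this issue. It is not the charm's actual code: the function name `dashboard_status` and the grouping of endpoints into "sources" and "sinks" are assumptions made for illustration, and it only models the case reported here (a sink related with no source).

```python
from typing import Iterable, Tuple

# Endpoint names are taken from the status output and comments in this issue.
# The split into sources and sinks is an assumption, not the charm's real logic.
DASHBOARD_SOURCES = frozenset({"dashboards"})
DASHBOARD_SINKS = frozenset({"downstream-grafana-dashboard", "cos-agent"})

BLOCKED_MESSAGE = "Missing one of (Grafana|dashboard|grafana-agent) relation(s)"


def dashboard_status(related_endpoints: Iterable[str]) -> Tuple[str, str]:
    """Return (status, message) for a set of related cos-proxy endpoints.

    Hypothetical helper: blocked when somewhere to send dashboards is
    related but nothing that provides dashboards is.
    """
    related = set(related_endpoints)
    has_source = bool(related & DASHBOARD_SOURCES)
    has_sink = bool(related & DASHBOARD_SINKS)
    if has_sink and not has_source:
        return ("blocked", BLOCKED_MESSAGE)
    return ("active", "")


if __name__ == "__main__":
    # Endpoints related on cos-proxy in the deployment reported above.
    reported = [
        "cos-agent",
        "downstream-grafana-dashboard",
        "downstream-prometheus-scrape",
        "juju-info",
        "monitors",
    ]
    print(dashboard_status(reported))
```

Under these assumptions the reported deployment evaluates to blocked even though nothing is misconfigured, which is the "properly configured but unused" case in question.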

FWIW, a PR that landed in March changed this logic for rev >= 73, and rev 73 landed in latest/stable about a month ago, so there's a decent chance this is all related.

If we make a change here, we should also revisit the logic for the other required data sources (Loki, Prometheus), as they probably follow the same pattern.

To Reproduce

Not reproduced, but the juju status output above has enough detail to reproduce if needed.

Environment

See juju status output

Relevant log output

See above

Additional context

No response

nishant-dash commented 2 months ago

I currently have cos-proxy deployed, but it's blocked:

cos-proxy/0*          blocked   idle   12       w.x.y.z              Missing one of (Grafana|dashboard|grafana-agent) relation(s)

The 3 relations in question seem to be over the interfaces:

I have relations that use the 2nd and 3rd of these but not the first, as I have no charm deployed that can make use of it. (For example, the etcd charm can use it as

etcd:grafana  cos-proxy:dashboards   grafana-dashboard  regular  

but I have no charm deployed in my env that can use this interface, so cos-proxy remains blocked.) This is with cos-proxy latest/edge rev 81 on Juju 3.5.1.

nishant-dash commented 2 months ago

I observed that the cos_agent relation seems to trigger this:

$ juju status --relations | grep -i cos-proxy-monitors
cos-proxy-monitors                     n/a              blocked      1  cos-proxy                                    latest/stable   73  no       Missing one of (Grafana|dashboard|grafana-agent) relation(s)
cos-proxy-monitors/0*                       blocked   idle   0/lxd/2  w.x.y.z                   Missing one of (Grafana|dashboard|grafana-agent) relation(s)
cos-proxy-monitors  cos-proxy-monitors  cos-proxy  73   0/0        monitors  monitors   requirer
cos-proxy-monitors:cos-agent                                         grafana-agent-container:cos-agent                             cos_agent                       subordinate  
cos-proxy-monitors:downstream-prometheus-scrape                      cos-scrape-interval-config-monitors:configurable-scrape-jobs  prometheus_scrape               regular      
nrpe-container:monitors                                              cos-proxy-monitors:monitors                                   monitors                        regular      
nrpe-host:monitors                                                   cos-proxy-monitors:monitors                                   monitors                        regular      

Then I remove the relation and it goes into active/idle:

$ juju remove-relation cos-proxy-monitors:cos-agent grafana-agent-container:cos-agent
$ juju status --relations | grep -i cos-proxy-monitors
cos-proxy-monitors                     n/a              active       1  cos-proxy                                    latest/stable   73  no       
cos-proxy-monitors/0*                       active    idle       0/lxd/2  w.x.y.z                   
cos-proxy-monitors  cos-proxy-monitors  cos-proxy  73   0/0        monitors  monitors   requirer
cos-proxy-monitors:cos-agent                                         grafana-agent-container:cos-agent                             cos_agent                       subordinate  
cos-proxy-monitors:downstream-prometheus-scrape                      cos-scrape-interval-config-monitors:configurable-scrape-jobs  prometheus_scrape               regular      
nrpe-container:monitors                                              cos-proxy-monitors:monitors                                   monitors                        regular      
nrpe-host:monitors                                                   cos-proxy-monitors:monitors                                   monitors                        regular