canonical / grafana-agent-operator

This charmed operator automates the operational procedures of running Grafana Agent, an open-source telemetry collector.
https://charmhub.io/grafana-agent
Apache License 2.0

Incorrect labels after relating to grafana-agent rev 38 #61

Closed by dstathis 4 months ago

dstathis commented 4 months ago

Bug Description

https://github.com/canonical/hardware-observer-operator/issues/158

To Reproduce

-

Environment

-

Relevant log output

-

Additional context

No response

gabrielcocenza commented 4 months ago

To add to the bug report: this unit label issue seems to happen when a subordinate charm such as hardware-observer relates to grafana-agent. For example:

Model            Controller  Cloud/Region         Version  SLA          Timestamp
test-charm-z6pi  overlord    localhost/localhost  2.9.43   unsupported  09:37:08-03:00

App                Version  Status   Scale  Charm              Channel  Rev  Exposed  Message
grafana-agent               blocked      1  grafana-agent      edge      52  no       send-remote-write: off, grafana-cloud-config: off, logging-consumer: off
hardware-observer           active       1  hardware-observer             0  no       Unit is ready
ubuntu             22.04    active       1  ubuntu             stable    24  no       

Unit                    Workload  Agent  Machine  Public address  Ports  Message
ubuntu/0*               active    idle   0        10.101.82.225          
  grafana-agent/0*      blocked   idle            10.101.82.225          send-remote-write: off, grafana-cloud-config: off, logging-consumer: off
  hardware-observer/1*  active    idle            10.101.82.225          Unit is ready

Machine  State    Address        Inst id        Series  AZ  Message
0        started  10.101.82.225  juju-c1881f-0  jammy       Running

On revision 28, grafana-agent generates a config file (/etc/grafana-agent.yaml) that includes the unit label for ubuntu/0; newer revisions omit that label.

Abuelodelanada commented 4 months ago

Debugging session.

With the following deployment:

╭─ubuntu@charm-dev-juju-34 ~ [lxd:hwo]
╰─$ jst                  
Model  Controller  Cloud/Region         Version  SLA          Timestamp
hwo    lxd         localhost/localhost  3.4.0    unsupported  16:45:56-03:00

SAAS                             Status  Store     URL
grafana-dashboards               active  microk8s  admin/cos.grafana-dashboards
loki-logging                     active  microk8s  admin/cos.loki-logging
prometheus-receive-remote-write  active  microk8s  admin/cos.prometheus-receive-remote-write

App     Version  Status  Scale  Charm              Channel  Rev  Exposed  Message
agent            active      1  grafana-agent      edge      54  no       
hwo              active      1  hardware-observer  edge      38  no       Unit is ready
ubuntu  22.04    active      1  ubuntu             stable    24  no       

Unit        Workload  Agent  Machine  Public address  Ports  Message
ubuntu/0*   active    idle   0        10.10.106.208          
  agent/3*  active    idle            10.10.106.208          
  hwo/0*    active    idle            10.10.106.208          Unit is ready

Machine  State    Address        Inst id        Base          AZ  Message
0        started  10.10.106.208  juju-261ede-0  ubuntu@22.04      Running

Integration provider                                  Requirer                              Interface                Type         Message
agent:grafana-dashboards-provider                     grafana-dashboards:grafana-dashboard  grafana_dashboard        regular      
agent:peers                                           agent:peers                           grafana_agent_replica    peer         
hwo:cos-agent                                         agent:cos-agent                       cos_agent                subordinate  
loki-logging:logging                                  agent:logging-consumer                loki_push_api            regular      
prometheus-receive-remote-write:receive-remote-write  agent:send-remote-write               prometheus_remote_write  regular      
ubuntu:juju-info                                      agent:juju-info                       juju-info                subordinate  
ubuntu:juju-info                                      hwo:general-info                      juju-info                subordinate  

The metrics section of the grafana-agent config carries no unit label:

╭─ubuntu@charm-dev-juju-34 ~ [lxd:hwo]
╰─$ juju ssh agent/3 "cat /etc/grafana-agent.yaml"                                                                                      

...

metrics:
  configs:
  - name: agent_scraper
    remote_write:
    - tls_config:
        insecure_skip_verify: false
      url: http://192.168.1.250/cos-prometheus-0/api/v1/write
    scrape_configs:
    - job_name: hwo_0_default
      metrics_path: /metrics
      static_configs:
      - labels:
          juju_application: hwo
          juju_model: hwo
          juju_model_uuid: 71de48c1-0f16-40b7-8ed3-87748d261ede
        targets:
        - localhost:10200

...
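
The missing label in the output above can be checked mechanically. The following is a minimal sketch, not part of the charm: it assumes the config has already been parsed into a dict (for example with a YAML loader) and that the expected label set is the usual Juju topology one, with `juju_unit` as the unit label. The function name `jobs_missing_labels` and the `REQUIRED_LABELS` tuple are hypothetical.

```python
# Assumption: these are the labels a scrape job is expected to carry.
# `juju_unit` is assumed to be the name of the unit label discussed in this issue.
REQUIRED_LABELS = ("juju_model", "juju_model_uuid", "juju_application", "juju_unit")


def jobs_missing_labels(config, required=REQUIRED_LABELS):
    """Return {job_name: [missing label names]} for jobs lacking a required label."""
    missing = {}
    for metrics_config in config.get("metrics", {}).get("configs", []):
        for job in metrics_config.get("scrape_configs", []):
            for static in job.get("static_configs", []):
                labels = static.get("labels", {})
                absent = [name for name in required if name not in labels]
                if absent:
                    missing.setdefault(job["job_name"], []).extend(absent)
    return missing


# The metrics section from agent/3, transcribed as a Python dict.
config = {
    "metrics": {
        "configs": [
            {
                "name": "agent_scraper",
                "scrape_configs": [
                    {
                        "job_name": "hwo_0_default",
                        "metrics_path": "/metrics",
                        "static_configs": [
                            {
                                "labels": {
                                    "juju_application": "hwo",
                                    "juju_model": "hwo",
                                    "juju_model_uuid": "71de48c1-0f16-40b7-8ed3-87748d261ede",
                                },
                                "targets": ["localhost:10200"],
                            }
                        ],
                    }
                ],
            }
        ]
    }
}

print(jobs_missing_labels(config))  # {'hwo_0_default': ['juju_unit']}
```

Run against the config shown here, the check reports `juju_unit` as the only missing label for the `hwo_0_default` job.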

After adding one more unit to the ubuntu application:

╭─ubuntu@charm-dev-juju-34 ~ [lxd:hwo]
╰─$ jst                 
Model  Controller  Cloud/Region         Version  SLA          Timestamp
hwo    lxd         localhost/localhost  3.4.0    unsupported  17:02:05-03:00

SAAS                             Status  Store     URL
grafana-dashboards               active  microk8s  admin/cos.grafana-dashboards
loki-logging                     active  microk8s  admin/cos.loki-logging
prometheus-receive-remote-write  active  microk8s  admin/cos.prometheus-receive-remote-write

App     Version  Status  Scale  Charm              Channel  Rev  Exposed  Message
agent            active      2  grafana-agent      edge      54  no       
hwo              active      2  hardware-observer  edge      38  no       Unit is ready
ubuntu           active      2  ubuntu             stable    24  no       

Unit        Workload  Agent  Machine  Public address  Ports  Message
ubuntu/0*   active    idle   0        10.10.106.208          
  agent/3*  active    idle            10.10.106.208          
  hwo/0*    active    idle            10.10.106.208          Unit is ready
ubuntu/2    active    idle   2        10.10.106.44           
  agent/4   active    idle            10.10.106.44           
  hwo/3     active    idle            10.10.106.44           Unit is ready

Machine  State    Address        Inst id        Base          AZ  Message
0        started  10.10.106.208  juju-261ede-0  ubuntu@22.04      Running
2        started  10.10.106.44   juju-261ede-2  ubuntu@22.04      Running

Integration provider                                  Requirer                              Interface                Type         Message
agent:grafana-dashboards-provider                     grafana-dashboards:grafana-dashboard  grafana_dashboard        regular      
agent:peers                                           agent:peers                           grafana_agent_replica    peer         
hwo:cos-agent                                         agent:cos-agent                       cos_agent                subordinate  
loki-logging:logging                                  agent:logging-consumer                loki_push_api            regular      
prometheus-receive-remote-write:receive-remote-write  agent:send-remote-write               prometheus_remote_write  regular      
ubuntu:juju-info                                      agent:juju-info                       juju-info                subordinate  
ubuntu:juju-info                                      hwo:general-info                      juju-info                subordinate 

The metrics section is identical in both grafana-agent config files, so nothing in the labels distinguishes agent/3 from agent/4:

╭─ubuntu@charm-dev-juju-34 ~ [lxd:hwo]
╰─$ juju ssh agent/3 "cat /etc/grafana-agent.yaml"
...
metrics:
  configs:
  - name: agent_scraper
    remote_write:
    - tls_config:
        insecure_skip_verify: false
      url: http://192.168.1.250/cos-prometheus-0/api/v1/write
    scrape_configs:
    - job_name: hwo_0_default
      metrics_path: /metrics
      static_configs:
      - labels:
          juju_application: hwo
          juju_model: hwo
          juju_model_uuid: 71de48c1-0f16-40b7-8ed3-87748d261ede
        targets:
        - localhost:10200
...
╭─ubuntu@charm-dev-juju-34 ~ [lxd:hwo]
╰─$ juju ssh agent/4 "cat /etc/grafana-agent.yaml"
...
metrics:
  configs:
  - name: agent_scraper
    remote_write:
    - tls_config:
        insecure_skip_verify: false
      url: http://192.168.1.250/cos-prometheus-0/api/v1/write
    scrape_configs:
    - job_name: hwo_0_default
      metrics_path: /metrics
      static_configs:
      - labels:
          juju_application: hwo
          juju_model: hwo
          juju_model_uuid: 71de48c1-0f16-40b7-8ed3-87748d261ede
        targets:
        - localhost:10200
...
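
To make the consequence concrete, here is a hedged sketch that compares the scrape jobs of several units and reports any job/target/label combination that appears on more than one of them. It assumes the configs are already parsed into dicts; `scrape_identities` and `colliding_identities` are hypothetical helper names. Whether the resulting series actually collide downstream depends on relabeling that is not shown in this issue (for instance, how the `instance` label is set).

```python
import copy


def scrape_identities(config):
    """Return a set of (job_name, target, sorted label items) tuples for one config."""
    identities = set()
    for metrics_config in config.get("metrics", {}).get("configs", []):
        for job in metrics_config.get("scrape_configs", []):
            for static in job.get("static_configs", []):
                labels = tuple(sorted(static.get("labels", {}).items()))
                for target in static.get("targets", []):
                    identities.add((job["job_name"], target, labels))
    return identities


def colliding_identities(configs_by_unit):
    """Map each identity found on more than one unit to the sorted list of those units."""
    seen = {}
    for unit, config in configs_by_unit.items():
        for identity in scrape_identities(config):
            seen.setdefault(identity, []).append(unit)
    return {ident: sorted(units) for ident, units in seen.items() if len(units) > 1}


# The metrics section shown for agent/3; agent/4 reports the same content.
agent_3 = {
    "metrics": {
        "configs": [
            {
                "name": "agent_scraper",
                "scrape_configs": [
                    {
                        "job_name": "hwo_0_default",
                        "metrics_path": "/metrics",
                        "static_configs": [
                            {
                                "labels": {
                                    "juju_application": "hwo",
                                    "juju_model": "hwo",
                                    "juju_model_uuid": "71de48c1-0f16-40b7-8ed3-87748d261ede",
                                },
                                "targets": ["localhost:10200"],
                            }
                        ],
                    }
                ],
            }
        ]
    }
}
agent_4 = copy.deepcopy(agent_3)

collisions = colliding_identities({"agent/3": agent_3, "agent/4": agent_4})
for (job_name, target, _labels), units in collisions.items():
    print(job_name, target, units)  # hwo_0_default localhost:10200 ['agent/3', 'agent/4']
```

With a per-unit label present in each config, the two identities would differ and the function would return an empty dict.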