canonical / cos-proxy-operator

A machine charm that provides a single integration point in the machine world with the Kubernetes-based COS bundle.
https://charmhub.io/cos-proxy
Apache License 2.0

Unable to relate prometheus to cos-proxy:downstream-prometheus-scrape #147

Closed drencrom closed 2 weeks ago

drencrom commented 2 months ago

Bug Description

Running juju relate prometheus cos-proxy:downstream-prometheus-scrape fails with this error: ERROR no relations found

To Reproduce

Steps to reproduce:

juju add-model observed
juju consume cos-k8s-controller:admin/cos.loki
juju consume cos-k8s-controller:admin/cos.grafana
juju consume cos-k8s-controller:admin/cos.prometheus  
juju deploy mysql-innodb-cluster -n 3
juju deploy cos-proxy
juju deploy telegraf
juju relate telegraf:prometheus-client cos-proxy:prometheus-target
juju relate telegraf:prometheus-rules cos-proxy:prometheus-rules
juju relate telegraf:dashboards cos-proxy:dashboards
juju relate telegraf mysql-innodb-cluster 
juju relate prometheus cos-proxy:downstream-prometheus-scrape

Bundle:

default-base: ubuntu@22.04/stable
saas:
  grafana:
    url: cos-k8s-controller:admin/cos.grafana
  loki:
    url: cos-k8s-controller:admin/cos.loki
  prometheus:
    url: cos-k8s-controller:admin/cos.prometheus
applications:
  cos-proxy:
    charm: cos-proxy
    channel: latest/stable
    revision: 82
    base: ubuntu@20.04/stable
    num_units: 1
    to:
    - "0"
    constraints: arch=amd64
  mysql-innodb-cluster:
    charm: mysql-innodb-cluster
    channel: 8.0/stable
    revision: 133
    resources:
      mysql-shell: 0
    num_units: 3
    to:
    - "1"
    - "2"
    - "3"
    constraints: arch=amd64
  telegraf:
    charm: telegraf
    channel: latest/stable
    revision: 75
machines:
  "0":
    constraints: arch=amd64
    base: ubuntu@20.04/stable
  "1":
    constraints: arch=amd64
  "2":
    constraints: arch=amd64
  "3":
    constraints: arch=amd64
relations:
- - grafana:grafana-dashboard
  - cos-proxy:downstream-grafana-dashboard
- - telegraf:prometheus-client
  - cos-proxy:prometheus-target
- - telegraf:prometheus-rules
  - cos-proxy:prometheus-rules
- - telegraf:dashboards
  - cos-proxy:dashboards
- - telegraf:mysql-monitor
  - mysql-innodb-cluster:db-monitor

Environment

Juju is running on an OpenStack cloud. The Juju version is 3.3.4.

Relevant log output

unit-cos-proxy-0: 13:19:52 INFO juju.worker.uniter awaiting error resolution for "relation-changed" hook
unit-cos-proxy-0: 13:19:53 ERROR unit.cos-proxy/0.juju-log prometheus-target:3: Uncaught exception while in charm code:
Traceback (most recent call last):
  File "./src/charm.py", line 643, in <module>
    main(COSProxyCharm)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/main.py", line 546, in main
    manager = _Manager(charm_class, use_juju_for_storage=use_juju_for_storage)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/main.py", line 429, in __init__
    self.charm = self._make_charm(self.framework, self.dispatcher)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/ops/main.py", line 432, in _make_charm
    charm = self._charm_class(framework)
  File "./src/charm.py", line 133, in __init__
    scrape_configs=self._get_scrape_configs(),
  File "./src/charm.py", line 264, in _get_scrape_configs
    target_job = ScrapeJobModel(**target_job_data)
  File "/var/lib/juju/agents/unit-cos-proxy-0/charm/venv/pydantic/main.py", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for ScrapeJobModel
metrics_path
  Field required [type=missing, input_value={'job_name': 'juju_observ...nce', 'regex': '(.*)'}]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
unit-cos-proxy-0: 13:19:53 ERROR juju.worker.uniter.operation hook "prometheus-target-relation-changed" (via hook dispatching script: dispatch) failed: exit status 1

Additional context

Output of juju status:

Model     Controller     Cloud/Region       Version  SLA          Timestamp
observed  jorge-merlino  stsstack/stsstack  3.3.4    unsupported  13:29:03-03:00

SAAS        Status  Store               URL
grafana     active  cos-k8s-controller  admin/cos.grafana
loki        active  cos-k8s-controller  admin/cos.loki
prometheus  active  cos-k8s-controller  admin/cos.prometheus

App                   Version  Status  Scale  Charm                 Channel        Rev  Exposed  Message
cos-proxy             n/a      error       1  cos-proxy             latest/stable   82  no       hook failed: "prometheus-target-relation-changed"
mysql-innodb-cluster  8.0.37   active      3  mysql-innodb-cluster  8.0/stable     133  no       Unit is ready: Mode: R/O, Cluster is ONLINE and can tolerate up to ONE failure.
telegraf                       active      3  telegraf              latest/stable   75  no       Monitoring mysql-innodb-cluster/2 (source version/commit 23.10)

Unit                     Workload  Agent  Machine  Public address  Ports     Message
cos-proxy/0*             error     idle   0        10.5.3.3                  hook failed: "prometheus-target-relation-changed" for telegraf:prometheus-client
mysql-innodb-cluster/0   active    idle   1        10.5.2.20                 Unit is ready: Mode: R/O, Cluster is ONLINE and can tolerate up to ONE failure.
  telegraf/2             active    idle            10.5.2.20       9103/tcp  Monitoring mysql-innodb-cluster/0 (source version/commit 23.10)
mysql-innodb-cluster/1*  active    idle   2        10.5.3.6                  Unit is ready: Mode: R/W, Cluster is ONLINE and can tolerate up to ONE failure.
  telegraf/1             active    idle            10.5.3.6        9103/tcp  Monitoring mysql-innodb-cluster/1 (source version/commit 23.10)
mysql-innodb-cluster/2   active    idle   3        10.5.1.25                 Unit is ready: Mode: R/O, Cluster is ONLINE and can tolerate up to ONE failure.
  telegraf/0*            active    idle            10.5.1.25       9103/tcp  Monitoring mysql-innodb-cluster/2 (source version/commit 23.10)

Machine  State    Address    Inst id                               Base          AZ    Message
0        started  10.5.3.3   38680f7d-594b-4f30-b404-57f98fbf716e  ubuntu@20.04  nova  ACTIVE
1        started  10.5.2.20  930f9105-18d3-4e7f-ab3b-5fb0573341c3  ubuntu@22.04  nova  ACTIVE
2        started  10.5.3.6   50c18354-3b15-4116-82a6-9bb60769f05f  ubuntu@22.04  nova  ACTIVE
3        started  10.5.1.25  e1b52cde-1971-4f88-b2eb-5e1df57eccc4  ubuntu@22.04  nova  ACTIVE

Integration provider                    Requirer                          Interface             Type         Message
cos-proxy:downstream-grafana-dashboard  grafana:grafana-dashboard         grafana_dashboard     regular      
mysql-innodb-cluster:cluster            mysql-innodb-cluster:cluster      mysql-innodb-cluster  peer         
mysql-innodb-cluster:coordinator        mysql-innodb-cluster:coordinator  coordinator           peer         
mysql-innodb-cluster:db-monitor         telegraf:mysql-monitor            mysql-monitor         subordinate  
telegraf:dashboards                     cos-proxy:dashboards              grafana-dashboard     regular      
telegraf:prometheus-client              cos-proxy:prometheus-target       http                  regular      
telegraf:prometheus-rules               cos-proxy:prometheus-rules        prometheus-rules      regular

lucabello commented 2 months ago

Hey :) Just a couple questions:

drencrom commented 1 month ago

Hi @lucabello

These are the offer commands:

juju offer loki:logging
juju offer prometheus:receive-remote-write
juju offer grafana:grafana-dashboard

I don't know whether the log is related. I added it because these are the only errors I see in the log. Do you know why the no relations found error can occur here?

lucabello commented 2 weeks ago

You're getting the no relations found error because you're supposed to relate cos-proxy:downstream-prometheus-scrape to prometheus:metrics-endpoint, not prometheus:receive-remote-write.

Closing, but please feel free to reopen if needed!

drencrom commented 1 week ago

Hi @lucabello, if I try your suggestion I get this error:

» juju relate prometheus:metrics-endpoint cos-proxy:downstream-prometheus-scrape
ERROR saas application "prometheus" has no "metrics-endpoint" relation

sed-i commented 3 days ago

@drencrom, you can offer multiple endpoints on the same offer. From the error it sounds like maybe only remote-write was offered?

drencrom commented 3 days ago

That was it! Thanks @sed-i
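
For anyone landing here later, the resolution discussed in this thread can be sketched roughly as follows. This is a minimal sketch under assumptions, not a verified procedure. The controller and model names (cos-k8s-controller, admin/cos, jorge-merlino, observed) are taken from this issue. It also assumes that re-running juju offer with both endpoints updates the existing offer, and that the SAAS entry has to be removed and consumed again before the new endpoint becomes visible. The JUJU variable is a hypothetical dry-run helper, not part of Juju: by default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: each command is printed, not executed.
# Run with JUJU=juju to execute against real controllers.
JUJU="${JUJU:-echo juju}"

# On the COS side: offer both Prometheus endpoints under one offer.
# Endpoints of the same application are comma-separated.
$JUJU switch cos-k8s-controller:admin/cos
$JUJU offer prometheus:receive-remote-write,metrics-endpoint

# On the machine side: consume the offer again so the new endpoint is
# visible (assumption: the old SAAS entry must be removed first).
$JUJU switch jorge-merlino:observed
$JUJU remove-saas prometheus
$JUJU consume cos-k8s-controller:admin/cos.prometheus

# Relate cos-proxy to the scrape endpoint that is now part of the offer.
$JUJU relate prometheus:metrics-endpoint cos-proxy:downstream-prometheus-scrape
```

Before relating, juju show-offer cos-k8s-controller:admin/cos.prometheus should list the endpoints the offer exposes, which is a quick way to confirm that metrics-endpoint is included.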