canonical / prometheus-k8s-operator

This charmed operator automates the operational procedures of running Prometheus, an open-source metrics backend.
https://charmhub.io/prometheus-k8s
Apache License 2.0

`scrape_jobs` is reset by something #427

Closed simondeziel closed 1 year ago

simondeziel commented 1 year ago

Bug Description

We have a charm that deals with LXD clusters. The intent is to have the app leader (lxd/8* in this example) send the desired scrape_jobs to Prometheus. We only want the leader to send the scrape_jobs because it knows which targets are ready to be scraped.

This works initially where Prometheus receives this:

$ juju show-unit -m cos prometheus/0
...
  - relation-id: 31
    endpoint: metrics-endpoint
    cross-model: true
    related-endpoint: metrics-endpoint
    application-data:
      scrape_jobs: '[{"metrics_path": "/1.0/metrics", "static_configs": [{"targets":
        ["172.17.33.7:8443", "172.17.33.8:8443"]}], "scheme": "https", "tls_config":
        {"insecure_skip_verify": true}}]'
      scrape_metadata: '{"model": "test", "model_uuid": "7570a55a-fe48-4bdf-8243-e19968a39797",
        "application": "lxd", "unit": "lxd/8", "charm_name": "lxd"}'
    related-units:
      remote-f664a1ba18d544dc86264f34dc57f4ee/8:
        in-scope: true
        data:
          egress-subnets: 2602:fc62:b:3003:0:1:0:2/128
          ingress-address: 2602:fc62:b:3003:0:1:0:2
          private-address: 2602:fc62:b:3003:0:1:0:2
          prometheus_scrape_unit_address: 172.17.33.7
          prometheus_scrape_unit_name: lxd/8
      remote-f664a1ba18d544dc86264f34dc57f4ee/9:
        in-scope: true
        data:
          egress-subnets: 2602:fc62:b:3003:0:1:0:3/128
          ingress-address: 2602:fc62:b:3003:0:1:0:3
          private-address: 2602:fc62:b:3003:0:1:0:3
          prometheus_scrape_unit_address: 172.17.33.8
          prometheus_scrape_unit_name: lxd/9
  provider-id: prometheus-0
  address: 10.1.134.109

Which translates to this config:

$ juju ssh -m cos --container prometheus prometheus/0 cat /etc/prometheus/prometheus.yml
...
- honor_labels: true
  job_name: juju_test_7570a55a_lxd_prometheus_scrape
  metrics_path: /1.0/metrics
  relabel_configs:
  - *id002
  scheme: https
  static_configs:
  - labels:
      juju_application: lxd
      juju_charm: lxd
      juju_model: test
      juju_model_uuid: 7570a55a-fe48-4bdf-8243-e19968a39797
    targets:
    - 172.17.33.7:8443
    - 172.17.33.8:8443
  tls_config:
    insecure_skip_verify: true

However, after some time during which the operator/admin does nothing, the following happens:

$ juju debug-log --level DEBUG -m cos --tail
...
unit-prometheus-0: 19:55:09 INFO unit.prometheus/0.juju-log metrics-endpoint:34: reqs=ResourceRequirements(claims=None, limits={}, requests={'cpu': '0.25', 'memory': '200Mi'}), templated=ResourceRequirements(claims=None, limits=None, requests={'cpu': '250m', 'memory': '200Mi'}), actual=ResourceRequirements(claims=None, limits=None, requests={'cpu': '250m', 'memory': '200Mi'})
unit-prometheus-0: 19:55:10 INFO unit.prometheus/0.juju-log metrics-endpoint:34: Pushed new configuration
unit-prometheus-0: 19:55:12 INFO unit.prometheus/0.juju-log metrics-endpoint:34: Prometheus configuration reloaded

Which causes Prometheus to lose the scrape_jobs:

$ juju show-unit -m cos prometheus/0
...
  - relation-id: 34
    endpoint: metrics-endpoint
    cross-model: true
    related-endpoint: metrics-endpoint
    application-data:
      scrape_jobs: '[{"metrics_path": "/metrics", "static_configs": [{"targets": ["*:80"]}]}]'
      scrape_metadata: '{"model": "test", "model_uuid": "7570a55a-fe48-4bdf-8243-e19968a39797",
        "application": "lxd", "unit": "lxd/8", "charm_name": "lxd"}'
    related-units:
      remote-f664a1ba18d544dc86264f34dc57f4ee/8:
        in-scope: true
        data:
          egress-subnets: 2602:fc62:b:3003:0:1:0:2/128
          ingress-address: 2602:fc62:b:3003:0:1:0:2
          private-address: 2602:fc62:b:3003:0:1:0:2
          prometheus_scrape_unit_address: 172.17.33.7
          prometheus_scrape_unit_name: lxd/8
      remote-f664a1ba18d544dc86264f34dc57f4ee/9:
        in-scope: true
        data:
          egress-subnets: 2602:fc62:b:3003:0:1:0:3/128
          ingress-address: 2602:fc62:b:3003:0:1:0:3
          private-address: 2602:fc62:b:3003:0:1:0:3
          prometheus_scrape_unit_address: 172.17.33.8
          prometheus_scrape_unit_name: lxd/9
  provider-id: prometheus-0
  address: 10.1.134.109

Which causes the Prometheus config to be reverted:

$ juju ssh -m cos --container prometheus prometheus/0 cat /etc/prometheus/prometheus.yml
- honor_labels: true
  job_name: juju_test_7570a55a_lxd_prometheus_scrape-9
  metrics_path: /metrics
  relabel_configs:
  - *id001
  static_configs:
  - labels:
      juju_application: lxd
      juju_charm: lxd
      juju_model: test
      juju_model_uuid: 7570a55a-fe48-4bdf-8243-e19968a39797
      juju_unit: lxd/9
    targets:
    - 172.17.33.8:80
- honor_labels: true
  job_name: juju_test_7570a55a_lxd_prometheus_scrape-8
  metrics_path: /metrics
  relabel_configs:
  - *id001
  static_configs:
  - labels:
      juju_application: lxd
      juju_charm: lxd
      juju_model: test
      juju_model_uuid: 7570a55a-fe48-4bdf-8243-e19968a39797
      juju_unit: lxd/8
    targets:
    - 172.17.33.7:80
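The reverted config above looks like what you would get if the default job's wildcard target (`*:80`) were expanded once per related unit, using each unit's prometheus_scrape_unit_address. Below is a minimal sketch of that expansion, assuming this rule; `expand_wildcard_targets` is a hypothetical helper written for illustration, not a function from the prometheus_scrape library.

```python
from typing import Dict, List

# Default job exactly as it appears in the relation data above.
DEFAULT_JOB = {"metrics_path": "/metrics", "static_configs": [{"targets": ["*:80"]}]}


def expand_wildcard_targets(job: dict, unit_addresses: Dict[str, str]) -> List[dict]:
    """Hypothetical helper: emit one job per unit for every "*:<port>" target."""
    expanded = []
    for unit_name, address in unit_addresses.items():
        targets = []
        for static_config in job.get("static_configs", []):
            for target in static_config.get("targets", []):
                host, _, port = target.rpartition(":")
                if host == "*":
                    # Replace the wildcard host with this unit's address.
                    targets.append(f"{address}:{port}")
        if targets:
            expanded.append(
                {
                    "metrics_path": job.get("metrics_path", "/metrics"),
                    "static_configs": [
                        {"labels": {"juju_unit": unit_name}, "targets": targets}
                    ],
                }
            )
    return expanded


# Unit addresses taken from the related-units data above.
units = {"lxd/8": "172.17.33.7", "lxd/9": "172.17.33.8"}
per_unit_jobs = expand_wildcard_targets(DEFAULT_JOB, units)
for job in per_unit_jobs:
    print(job["static_configs"][0])
```

This yields one job per unit on port 80, which matches the shape of the two `prometheus_scrape-8` and `prometheus_scrape-9` jobs in the reverted config.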

To Reproduce

  1. Download our PoC charm for LXD (https://sdeziel.info/pub/lxd_ubuntu-22.04-amd64.charm)
  2. juju deploy -m test ./lxd_ubuntu-22.04-amd64.charm --config lxd-listen-https=true --config mode=cluster
  3. Deploy COS-Lite including prometheus-k8s
  4. juju integrate -m test lxd:metrics-endpoint prometheus-scrape:metrics-endpoint

Environment

cos model:

$ juju status -m cos --relations
Model  Controller  Cloud/Region            Version  SLA          Timestamp
cos    overlord    microk8s-cos/localhost  3.0.2    unsupported  21:39:06Z

App           Version  Status  Scale  Charm             Channel  Rev  Address         Exposed  Message
alertmanager  0.23.0   active      1  alertmanager-k8s  edge      38  10.152.183.20   no       
catalogue              active      1  catalogue-k8s     edge       6  10.152.183.211  no       
grafana       9.2.1    active      1  grafana-k8s       edge      59  10.152.183.118  no       
loki          2.4.1    active      1  loki-k8s          edge      49  10.152.183.135  no       
prometheus    2.33.5   active      1  prometheus-k8s    edge      92  10.152.183.89   no       
traefik                active      1  traefik-k8s       edge     100  172.17.33.1     no       

Unit             Workload  Agent  Address       Ports  Message
alertmanager/0*  active    idle   10.1.134.94          
catalogue/0*     active    idle   10.1.134.116         
grafana/0*       active    idle   10.1.134.97          
loki/0*          active    idle   10.1.134.115         
prometheus/0*    active    idle   10.1.134.109         
traefik/0*       active    idle   10.1.134.126         

Offer                            Application   Charm             Rev  Connected  Endpoint              Interface                Role
alertmanager-karma-dashboard     alertmanager  alertmanager-k8s  38   0/0        karma-dashboard       karma_dashboard          provider
grafana-dashboards               grafana       grafana-k8s       59   0/0        grafana-dashboard     grafana_dashboard        requirer
loki-logging                     loki          loki-k8s          49   0/0        logging               loki_push_api            provider
prometheus-receive-remote-write  prometheus    prometheus-k8s    92   0/0        receive-remote-write  prometheus_remote_write  provider
prometheus-scrape                prometheus    prometheus-k8s    92   1/1        metrics-endpoint      prometheus_scrape        requirer

Relation provider                   Requirer                     Interface              Type     Message
alertmanager:alerting               loki:alertmanager            alertmanager_dispatch  regular  
alertmanager:alerting               prometheus:alertmanager      alertmanager_dispatch  regular  
alertmanager:grafana-dashboard      grafana:grafana-dashboard    grafana_dashboard      regular  
alertmanager:grafana-source         grafana:grafana-source       grafana_datasource     regular  
alertmanager:replicas               alertmanager:replicas        alertmanager_replica   peer     
alertmanager:self-metrics-endpoint  prometheus:metrics-endpoint  prometheus_scrape      regular  
catalogue:catalogue                 alertmanager:catalogue       catalogue              regular  
catalogue:catalogue                 grafana:catalogue            catalogue              regular  
catalogue:catalogue                 prometheus:catalogue         catalogue              regular  
grafana:grafana                     grafana:grafana              grafana_peers          peer     
grafana:metrics-endpoint            prometheus:metrics-endpoint  prometheus_scrape      regular  
loki:grafana-dashboard              grafana:grafana-dashboard    grafana_dashboard      regular  
loki:grafana-source                 grafana:grafana-source       grafana_datasource     regular  
loki:metrics-endpoint               prometheus:metrics-endpoint  prometheus_scrape      regular  
prometheus:grafana-dashboard        grafana:grafana-dashboard    grafana_dashboard      regular  
prometheus:grafana-source           grafana:grafana-source       grafana_datasource     regular  
prometheus:prometheus-peers         prometheus:prometheus-peers  prometheus_peers       peer     
traefik:ingress                     alertmanager:ingress         ingress                regular  
traefik:ingress                     catalogue:ingress            ingress                regular  
traefik:ingress-per-unit            loki:ingress                 ingress_per_unit       regular  
traefik:ingress-per-unit            prometheus:ingress           ingress_per_unit       regular  
traefik:metrics-endpoint            prometheus:metrics-endpoint  prometheus_scrape      regular  
traefik:traefik-route               grafana:ingress              traefik_route          regular

test model:

$ juju status -m test --relations
Model  Controller  Cloud/Region  Version  SLA          Timestamp
test   overlord    maas/default  3.0.2    unsupported  21:38:36Z

SAAS               Status  Store     URL
prometheus-scrape  active  overlord  admin/cos.prometheus-scrape

App  Version  Status  Scale  Charm  Channel  Rev  Exposed  Message
lxd           active      2  lxd               6  no       

Unit    Workload  Agent  Machine  Public address            Ports     Message
lxd/8*  active    idle   8        2602:fc62:b:3003:0:1:0:2  8443/tcp  
lxd/9   active    idle   9        2602:fc62:b:3003:0:1:0:3  8443/tcp  

Machine  State    Address                   Inst id       Base          AZ       Message
8        started  2602:fc62:b:3003:0:1:0:2  r03-amd64-05  ubuntu@22.04  default  Deployed
9        started  2602:fc62:b:3003:0:1:0:3  r03-amd64-02  ubuntu@22.04  default  Deployed

Relation provider     Requirer                            Interface          Type     Message
lxd:cluster           lxd:cluster                         lxd-cluster        peer     
lxd:metrics-endpoint  prometheus-scrape:metrics-endpoint  prometheus_scrape  regular

Relevant log output

I've included the logs I believe were relevant; let me know if more are needed.

Additional context

It's quite possible I'm using the prometheus-k8s prometheus_scrape lib wrong... help pointing out where I'm using it wrong would be greatly appreciated!

simskij commented 1 year ago

Given that the lxd databag is empty, this must be happening in the provider end (i.e. lxd). Prometheus would not be allowed by Juju to wipe data from the remote app databag.

Can you point me to the code that generates your scrape targets? Likely, it is being run multiple times with different state, leading to an empty databag.

simondeziel commented 1 year ago

> Can you point me to the code that generates your scrape targets? Likely, it is being run multiple times with different state, leading to an empty databag.

Possibly! I just double-checked: self.metrics_endpoint.update_scrape_job_spec(jobs=jobs) is called only once, by lxd_update_metrics_endpoint_scrape_job, which logs a message when it does, and I only see that message once.

The PoC charm is available at https://sdeziel.info/pub/lxd_ubuntu-22.04-amd64.charm. Thanks for looking into this, it's much appreciated!

simondeziel commented 1 year ago

@simskij An important point I forgot to mention: the LXD charm is a machine charm.

rbarry82 commented 1 year ago

I haven't quite had a chance to reproduce this yet, but I did have a chance to go through the charm code after unpacking it.

There is a lookaside_jobs_callable constructor arg which takes a Callable whose result is added onto the list of jobs. MetricsEndpointProvider(..., lookaside_jobs_callable=self.lxd_update_metrics_endpoint_scrape_job) is probably a viable workaround.

That said, this is a bug in our code. Every time any event fires for any reason, the constructor is re-invoked, and it clobbers the scrape jobs as the last thing it does. That behaviour was introduced as part of this PR, but it was an unreliable, hacky solution to an event-ordering problem: external_url being updated by an ingress relation-changed event after it had already been used in the constructor.

Multiple patches have completely eliminated the need for this, and it should be removed here also.
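To make the failure mode described above concrete, here is a toy model of it. `ToyProvider` is a simplified stand-in written for this thread, not the real MetricsEndpointProvider; the assumptions are that the constructor rewrites the databag as its last step and that the lookaside callable is re-evaluated on every construction.

```python
from typing import Callable, List, Optional

DEFAULT_JOB = {"metrics_path": "/metrics", "static_configs": [{"targets": ["*:80"]}]}


class ToyProvider:
    """Simplified stand-in for the provider object; NOT the real library class."""

    def __init__(
        self,
        databag: dict,
        jobs: Optional[List[dict]] = None,
        lookaside_jobs_callable: Optional[Callable[[], List[dict]]] = None,
    ):
        self._databag = databag
        self._jobs = jobs or [DEFAULT_JOB]
        self._lookaside = lookaside_jobs_callable
        # Assumed behaviour: the constructor rewrites the databag as its last step.
        self._publish()

    def _publish(self) -> None:
        jobs = list(self._jobs)
        if callable(self._lookaside):
            jobs.extend(self._lookaside())
        self._databag["scrape_jobs"] = jobs

    def update_scrape_job_spec(self, jobs: List[dict]) -> None:
        self._jobs = jobs
        self._publish()


lxd_jobs = [
    {
        "metrics_path": "/1.0/metrics",
        "static_configs": [{"targets": ["172.17.33.7:8443", "172.17.33.8:8443"]}],
    }
]

# Event 1: the charm is instantiated and pushes its jobs once.
databag: dict = {}
ToyProvider(databag).update_scrape_job_spec(lxd_jobs)
after_update = list(databag["scrape_jobs"])

# Event 2: an unrelated event re-instantiates the charm; nothing calls update again.
ToyProvider(databag)
after_next_event = list(databag["scrape_jobs"])

# Workaround: the callable is consulted on every construction, so the jobs survive.
lookaside_bag: dict = {}
ToyProvider(lookaside_bag, lookaside_jobs_callable=lambda: lxd_jobs)
ToyProvider(lookaside_bag, lookaside_jobs_callable=lambda: lxd_jobs)

print(after_update == lxd_jobs, after_next_event == [DEFAULT_JOB])
```

In this model the jobs pushed through update_scrape_job_spec are lost on the next event, while jobs supplied through the callable persist, although the default job still sits in front of them.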

simondeziel commented 1 year ago

@rbarry82 many thanks, and yes, I'm now using lookaside_jobs, which avoids the resetting behaviour. I'm still trying to find a way to not have the default scrape_jobs appended to the list, but at least lookaside_jobs seems to be the right way for us.

I'll look into this come Monday and will update this issue. Thanks again!

rbarry82 commented 1 year ago

Happy to help! Admittedly, that was added for a different kind of discovery, but I'm glad it works here.

In general, we kind of don't want to support "arbitrary" endpoint discovery straight to Prometheus (HTTP-based, k8s service discovery, or the other "usual" Prometheus methods) because they're not able to be represented in Juju models in any meaningful way, which makes it hard/impossible to export/import models.

However, I don't really think the initial design considered the possibility of a client where we'd want to have a step-between use case. There's prometheus-scrape-target-k8s-operator, but that's intended to add arbitrary/non-Juju endpoints (not ones which are dynamically built), and it's something we could/should support.

Arbitrarily, if lookaside_jobs_callable is truthy and jobs is falsy, this is an easy conditional (it would also need to be initialized to DEFAULT_JOB, but that's easy for you to override with None or [] or whatever), but it's something we should do anyway. @simskij thoughts?
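One way the conditional above could look, as a rough sketch only: `resolve_jobs` is a hypothetical function name, and the rule it encodes (fall back to DEFAULT_JOB only when neither static jobs nor a lookaside callable is given) is an assumption about the intended behaviour, not the library's current code.

```python
from typing import Callable, List, Optional

DEFAULT_JOB = {"metrics_path": "/metrics", "static_configs": [{"targets": ["*:80"]}]}


def resolve_jobs(
    jobs: Optional[List[dict]],
    lookaside_jobs_callable: Optional[Callable[[], List[dict]]] = None,
) -> List[dict]:
    """Hypothetical sketch: skip the default job when a lookaside callable is set."""
    static_jobs = list(jobs) if jobs else []
    if not static_jobs and not lookaside_jobs_callable:
        # Neither source provided anything: keep today's default behaviour.
        static_jobs = [DEFAULT_JOB]
    lookaside = lookaside_jobs_callable() if lookaside_jobs_callable else []
    return static_jobs + list(lookaside)


lxd_job = {"metrics_path": "/1.0/metrics", "static_configs": [{"targets": ["172.17.33.7:8443"]}]}

only_default = resolve_jobs(None)
only_lookaside = resolve_jobs(None, lambda: [lxd_job])
both = resolve_jobs([DEFAULT_JOB], lambda: [lxd_job])
print(len(only_default), len(only_lookaside), len(both))
```

With this rule a charm that only supplies a callable gets just its own jobs, and one that still wants the default job alongside them can pass it explicitly through jobs.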