canonical / grafana-agent-operator

This charmed operator automates the operational procedures of running Grafana Agent, an open-source telemetry collector.
https://charmhub.io/grafana-agent
Apache License 2.0

Tracing support for grafana agent #144

Open PietroPasotti opened 2 days ago

PietroPasotti commented 2 days ago

The goal is to configure grafana-agent so that, when it is related across models (CMR'd) with a Tempo backend (for example, the one in cos-lite), grafana-agent forwards any traces it receives to Tempo.

Testing Instructions

Deploy this bundle on an lxd model:

default-base: ubuntu@22.04/stable               
saas:                                           
  tempo1:                                       
    url: microk8s-localhost:admin/bar.tempo1    
applications:                                   
  gagent:                                       
    charm: local:grafana-agent-0                
    options:                                    
      tls_insecure_skip_verify: true            
  ubuntu:                                       
    charm: ubuntu                               
    channel: stable                             
    revision: 24                                
    num_units: 1                                
    to:                                         
    - "0"                                       
    constraints: arch=amd64                     
    storage:                                    
      block: loop,100M                          
      files: rootfs,100M                        
machines:                                       
  "0":                                          
    constraints: arch=amd64                     
relations:                                      
- - gagent:juju-info                            
  - ubuntu:juju-info                            
- - gagent:tracing                              
  - tempo1:tracing                              
--- # overlay.yaml                              
applications:                                   
  gagent:                                       
    offers:                                     
      gagent:                                   
        endpoints:                              
        - tracing                               
        acl:                                    
          admin: admin                          

And this bundle on a k8s model:

bundle: kubernetes
saas:
  remote-dfbd46d0396f448b8d12aee6ab922fae: {}
applications:
  alertmanager:
    charm: alertmanager-k8s
    channel: latest/edge
    revision: 125
    base: ubuntu@20.04/stable
    resources:
      alertmanager-image: 94
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
    trust: true
  catalogue:
    charm: catalogue-k8s
    channel: latest/edge
    revision: 59
    base: ubuntu@20.04/stable
    resources:
      catalogue-image: 33
    scale: 1
    options:
      description: "Canonical Observability Stack Lite, or COS Lite, is a light-weight,
        highly-integrated, \nJuju-based observability suite running on Kubernetes.\n"
      tagline: Model-driven Observability Stack deployed with a single command.
      title: Canonical Observability Stack
    constraints: arch=amd64
    trust: true
  grafana:
    charm: grafana-k8s
    channel: latest/edge
    revision: 117
    base: ubuntu@20.04/stable
    resources:
      grafana-image: 69
      litestream-image: 44
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,1024M
    trust: true
  loki:
    charm: loki-k8s
    channel: latest/edge
    revision: 160
    base: ubuntu@20.04/stable
    resources:
      loki-image: 98
      node-exporter-image: 2
    scale: 1
    constraints: arch=amd64
    storage:
      active-index-directory: kubernetes,1,1024M
      loki-chunks: kubernetes,1,1024M
    trust: true
  prometheus:
    charm: prometheus-k8s
    channel: latest/edge
    revision: 209
    base: ubuntu@20.04/stable
    resources:
      prometheus-image: 148
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,1024M
    trust: true
  tempo1:
    charm: tempo-k8s
    channel: latest/edge
    revision: 71
    resources:
      tempo-image: 16
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
  traefik:
    charm: traefik-k8s
    channel: latest/edge
    revision: 199
    base: ubuntu@20.04/stable
    resources:
      traefik-image: 160
    scale: 1
    constraints: arch=amd64
    storage:
      configurations: kubernetes,1,1024M
    trust: true
relations:
- - traefik:ingress-per-unit
  - prometheus:ingress
- - traefik:ingress-per-unit
  - loki:ingress
- - traefik:traefik-route
  - grafana:ingress
- - traefik:ingress
  - alertmanager:ingress
- - prometheus:alertmanager
  - alertmanager:alerting
- - grafana:grafana-source
  - prometheus:grafana-source
- - grafana:grafana-source
  - loki:grafana-source
- - grafana:grafana-source
  - alertmanager:grafana-source
- - loki:alertmanager
  - alertmanager:alerting
- - prometheus:metrics-endpoint
  - traefik:metrics-endpoint
- - prometheus:metrics-endpoint
  - alertmanager:self-metrics-endpoint
- - prometheus:metrics-endpoint
  - loki:metrics-endpoint
- - prometheus:metrics-endpoint
  - grafana:metrics-endpoint
- - grafana:grafana-dashboard
  - loki:grafana-dashboard
- - grafana:grafana-dashboard
  - prometheus:grafana-dashboard
- - grafana:grafana-dashboard
  - alertmanager:grafana-dashboard
- - catalogue:ingress
  - traefik:ingress
- - catalogue:catalogue
  - grafana:catalogue
- - catalogue:catalogue
  - prometheus:catalogue
- - catalogue:catalogue
  - alertmanager:catalogue
- - tempo1:ingress
  - traefik:traefik-route
- - tempo1:grafana-source
  - grafana:grafana-source
- - tempo1:logging
  - loki:logging
- - tempo1:tracing
  - remote-dfbd46d0396f448b8d12aee6ab922fae:tracing
--- # overlay.yaml
applications:
  tempo1:
    offers:
      tempo1:
        endpoints:
        - tracing
        acl:
          admin: admin

Then run: juju relate gagent tempo1

Now you can push any OTLP/gRPC traces you like to [gagent IP]:4317, and they should show up in Tempo.
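Before pushing traces, it can help to confirm that the receiver port on the gagent unit is reachable at all. The following is a minimal sketch using only the Python standard library; the function name `port_is_open` is hypothetical, and the check only proves that something accepts TCP connections on the port, not that it speaks OTLP/gRPC.

```python
import socket


def port_is_open(host: str, port: int = 4317, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    4317 is the port mentioned in the testing instructions; pass the gagent
    unit's IP as `host`.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("<gagent IP>")` returning False would point to a networking or receiver-configuration problem rather than to the trace exporter.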

TODO:
[ ] - Verify TLS works
[ ] - Find out why the HTTP receiver on gagent isn't working (gives me a 404)
[ ] - (grafana-agent): add tracing protocol interface to cos-agent lib:
    [ ] - requirer (principal charm) should request a list of protocols, much like the tracing requirer
    [ ] - provider (gagent itself) should reply with all currently enabled protocols, much like tempo does
[ ] - grafana-agent-k8s: implement tracing requirer and provider
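To help reproduce the HTTP receiver 404 from the TODO list, here is a rough standard-library sketch that builds a single-span OTLP/JSON payload and POSTs it. It assumes the receiver listens on the OTLP/HTTP defaults (port 4318, path `/v1/traces`); whether gagent is actually configured that way is not confirmed here. The function names and the `gagent-http-test` service name are hypothetical.

```python
import json
import secrets
import time
import urllib.error
import urllib.request


def build_otlp_payload(service_name: str, span_name: str) -> dict:
    """Build a one-span trace in the OTLP/JSON encoding."""
    now = time.time_ns()
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-test"},
                "spans": [{
                    # OTLP/JSON uses hex strings: 16-byte trace ID, 8-byte span ID.
                    "traceId": secrets.token_hex(16),
                    "spanId": secrets.token_hex(8),
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    # 64-bit timestamps are encoded as decimal strings.
                    "startTimeUnixNano": str(now - 1_000_000),
                    "endTimeUnixNano": str(now),
                }],
            }],
        }]
    }


def post_trace(host: str, port: int = 4318, path: str = "/v1/traces") -> int:
    """POST a test span and return the HTTP status code (assumed endpoint)."""
    body = json.dumps(build_otlp_payload("gagent-http-test", "test-span")).encode()
    req = urllib.request.Request(
        f"http://{host}:{port}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as the 404 noted above) land here.
        return err.code
```

Calling `post_trace("<gagent IP>")` returns the status code, so a 404 can be compared across different ports or paths; connection-level failures are raised as `urllib.error.URLError` rather than returned.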