hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Improve documentation for topology metrics #9700

Open marvreichmann opened 3 years ago

marvreichmann commented 3 years ago

Old UI or New UI: New UI, Consul 1.9.2

Describe the problem you're having: I cannot get the metrics to show up. I configured a Prometheus backend and added it as a metrics proxy in the Consul server config. Requests from the UI return with status code 200, but the UI does not display any values. Prometheus is collecting values for the registered services.

Describe the solution you'd like: Please update the documentation for this feature with an example configuration for Prometheus. Obviously there must be some setting I am missing. The example referenced for Kubernetes does mention some necessary steps, but leaves out the Prometheus configuration.

johncowen commented 3 years ago

Hey @marvreichmann

Thanks for raising this. Whilst I can't help immediately with documentation updates, I may be able to help from a frontend perspective.

Just in case, have you seen this learn guide?

https://learn.hashicorp.com/tutorials/consul/kubernetes-layer7-observability

It describes a full deployment using k6 for traffic simulation in order to observe the traffic in the topology panel for a service in the UI. Hopefully that helps?

marvreichmann commented 3 years ago

hey @johncowen

Thanks for your answer.

I did in fact work through the guide you mentioned. I needed to take some different steps as I am working with Nomad rather than Kubernetes, but the core Consul configs shouldn't differ too much, unless there is something fundamentally different about the way Nomad spawns the Envoy proxy containers for Consul Connect that I am not aware of.

I suspect the config gap is somewhere between Consul and Prometheus, as I can see the metrics in Prometheus but the UI can't find them. I tried debugging my way through by examining the HTTP requests directed at the metrics proxy. Those looked fine to me when comparing the queried metrics to the ones collected in Prometheus.
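One way to narrow down a "200 but no values" symptom like this is to replay a query through the agent's metrics proxy and check whether the response actually contains any series. The following is only a rough sketch under assumptions: it assumes the proxy is served under `/v1/internal/ui/metrics-proxy/` on the agent's HTTP port, and the helper names, the agent address and the sample query are hypothetical, not the queries the UI itself sends.

```python
import json
from urllib.parse import urlencode

# Assumed path prefix under which the Consul agent exposes the UI metrics proxy.
PROXY_PREFIX = "/v1/internal/ui/metrics-proxy"


def proxy_query_url(agent_addr: str, promql: str) -> str:
    """Build an instant-query URL that goes through the metrics proxy."""
    query = urlencode({"query": promql})
    return f"http://{agent_addr}{PROXY_PREFIX}/api/v1/query?{query}"


def has_samples(body: str) -> bool:
    """Return True if a Prometheus query response contains at least one series."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        return False
    return len(payload.get("data", {}).get("result", [])) > 0


# A successful (HTTP 200) response can still carry an empty result list,
# in which case there is nothing for the UI to display.
EMPTY = '{"status": "success", "data": {"resultType": "vector", "result": []}}'
FILLED = (
    '{"status": "success", "data": {"resultType": "vector", "result": '
    '[{"metric": {"job": "service-name"}, "value": [1612345678, "1"]}]}}'
)

if __name__ == "__main__":
    # Hypothetical agent address and query, for illustration only.
    print(proxy_query_url("127.0.0.1:8500", 'up{job="service-name"}'))
    print(has_samples(EMPTY), has_samples(FILLED))
```

If the replayed query comes back with an empty `result` list, the label values in the query (service name, datacenter, and so on) are the first thing to compare against the labels Prometheus actually stores.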

Needless to say, I'm rather clueless where to look next.

johncowen commented 3 years ago

Is there any configuration you can share? Then I can maybe have a chat with folks here and see if we can help you further.

marvreichmann commented 3 years ago

Prometheus Config

global:
  scrape_interval:     5s

scrape_configs:
  - job_name: 'consul_services'
    metrics_path: "/metrics"

    consul_sd_configs:
      - server: '{{ env "NOMAD_IP_prometheus_ui" }}:8500'
        datacenter: 'dc1'

    relabel_configs:

      - source_labels: [__meta_consul_tags]
        regex: .*,prometheus,.*
        action: keep

      - source_labels: [__meta_consul_service]
        target_label: job

      - source_labels: ['__address__']
        separator:     ':'
        regex:         '(.*):(.*)'
        target_label:  '__address__'
        replacement:   '${1}:9102'
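
To make it easier to reason about what the two relabel rules above do to a discovered target, here is a small Python approximation. It is a sketch, not Prometheus's own implementation: it assumes relabel regexes are matched against the whole label value, and that `__meta_consul_tags` holds the tags joined by the separator with a leading and trailing separator. The function names and sample values are hypothetical.

```python
import re
from typing import List


def keep_target(tags: List[str], separator: str = ",") -> bool:
    """Approximate the `keep` rule: the tag label must fully match `.*,prometheus,.*`."""
    # Assumption: the joined tag list is surrounded by the separator,
    # e.g. ["a", "b"] becomes ",a,b,".
    label = separator + separator.join(tags) + separator
    return re.fullmatch(r".*,prometheus,.*", label) is not None


def rewrite_address(address: str, port: int = 9102) -> str:
    """Approximate the `__address__` rule: keep the host part, swap the port."""
    match = re.fullmatch(r"(.*):(.*)", address)
    if match is None:
        # The regex did not match, so the label is left unchanged.
        return address
    return f"{match.group(1)}:{port}"


if __name__ == "__main__":
    # Hypothetical service registration, for illustration only.
    print(keep_target(["prometheus", "web"]))
    print(rewrite_address("10.0.0.5:24567"))
```

Under these assumptions, only services carrying the exact tag `prometheus` are kept, and every kept target is scraped on port 9102 of the address registered in Consul, so that address has to be one where the Envoy Prometheus listener is reachable.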
marvreichmann commented 3 years ago

Sidecar config from Nomad Job File

service {
  name = "service-name"
  port = 80

  connect {
    sidecar_service {
      port = "monitoring"

      proxy {
        config {
          envoy_prometheus_bind_addr = "0.0.0.0:9102"
          protocol                   = "http"
        }
      }
    }
  }
}

marvreichmann commented 3 years ago

Consul Server Config

telemetry {
  prometheus_retention_time = "30s"
}

ui_config {
  enabled = true
  metrics_provider = "prometheus"
  metrics_proxy {
    base_url = "http://10.10.10.212:9090"
  }
}

connect {
  enabled = true
}

sri4kanne commented 3 years ago

@marvreichmann were you able to get this working? I'm having issues setting this up for metrics as well. We are running a 5-node cluster on VMs and are currently on the latest Consul version (1.10.1). Below is the related config from one of the nodes; all the nodes have the same config.

    "telemetry": {
        "disable_hostname": true,
        "prometheus_retention_time": "30s"
    },
    "ui_config": {
        "enabled": true,
        "metrics_provider": "prometheus",
        "metrics_proxy": {
            "base_url": "http://promhostdev111:80"
        }
    },

And below is the config snippet from Prometheus.

- job_name: consul
  honor_timestamps: true
  params:
    format:
    - prometheus
  scrape_interval: 15s
  scrape_timeout: 15s
  metrics_path: /v1/agent/metrics
  scheme: http
  consul_sd_configs:
  - server: consul.service.dev-dc.consul:8500
    tag_separator: ','
    scheme: http
    allow_stale: true
    refresh_interval: 30s
    services:
    - consul
  relabel_configs:
  - source_labels: [__meta_consul_node]
    separator: ;
    regex: (.*)
    target_label: __address__
    replacement: ${1}:8500
    action: replace
  - source_labels: [__meta_consul_node]
    separator: ;
    regex: (.*)
    target_label: instance
    replacement: $1
    action: replace
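
As a rough illustration of which endpoint the job above ends up scraping, the sketch below assembles the scrape URL from the `scheme`, the rewritten `__address__`, the `metrics_path` and the `params` shown in the config. The function name and the node name are hypothetical, and this is an approximation of how the pieces combine rather than Prometheus's own code.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def scrape_url(
    node: str,
    scheme: str = "http",
    port: int = 8500,
    metrics_path: str = "/v1/agent/metrics",
    params: Optional[Dict[str, List[str]]] = None,
) -> str:
    """Combine the job settings into the URL a scrape would request."""
    if params is None:
        # Mirrors the `params` block of the job above.
        params = {"format": ["prometheus"]}
    # The first relabel rule sets __address__ to "<node>:8500".
    address = f"{node}:{port}"
    query = urlencode(params, doseq=True)
    return f"{scheme}://{address}{metrics_path}?{query}"


if __name__ == "__main__":
    # Hypothetical node name, for illustration only.
    print(scrape_url("consul-node-1"))
```

Seen this way, the job collects the Consul agents' own metrics from `/v1/agent/metrics`; whether that alone is enough for the topology view is not something this thread confirms.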