hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Unable to load metrics in topology view #12403

Open nahsi opened 2 years ago

nahsi commented 2 years ago

Overview of the Issue

Cannot set up metrics in the topology view. Configuration:

  ui_config:
    enabled: true
    metrics_provider: "prometheus"
    metrics_proxy:
      base_url: "https://victoria-metrics.service.consul"
    dashboard_url_templates:
      service: !unsafe "https://grafana.service.consul/d/{{Service.Meta.dashboard}}"

When I enable debug logs and open the generated queries from the log directly, they work fine. For example:

> curl -sS 'https://victoria-metrics.service.consul/api/v1/query?query=histogram_quantile(0.5%2C%20sum%20by(le%2Cconsul_source_service%2Cconsul_source_datacenter%2Cconsul_source_namespace)%20(rate(envoy_cluster_upstream_rq_time_bucket%7Bconsul_destination_service%3D%22victoria-metrics%22%2Cconsul_destination_datacenter%3D%22oikumene%22%2Cconsul_destination_namespace%3D%22default%22%7D%5B15m%5D)))&time=1645379365.739' | jq
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "consul_source_datacenter": "oikumene",
          "consul_source_namespace": "default",
          "consul_source_service": "vmagent"
        },
        "value": [
          1645379365.739,
          "20.937829694377903"
        ]
      }
    ]
  }
}

The metrics are unavailable only from the Consul UI.
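The percent-encoded query in the curl command above is hard to read, so here is a minimal sketch that decodes it back into PromQL using only the Python standard library. The URL is copied from the command as posted; nothing else is assumed.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the curl command above, split across lines for readability.
url = (
    "https://victoria-metrics.service.consul/api/v1/query"
    "?query=histogram_quantile(0.5%2C%20sum%20by(le%2Cconsul_source_service"
    "%2Cconsul_source_datacenter%2Cconsul_source_namespace)%20(rate("
    "envoy_cluster_upstream_rq_time_bucket%7Bconsul_destination_service"
    "%3D%22victoria-metrics%22%2Cconsul_destination_datacenter%3D%22oikumene"
    "%22%2Cconsul_destination_namespace%3D%22default%22%7D%5B15m%5D)))"
    "&time=1645379365.739"
)

# parse_qs percent-decodes every parameter and returns a list per key.
params = parse_qs(urlsplit(url).query)
promql = params["query"][0]
print(promql)
```

The decoded query is a median (0.5 quantile) of envoy_cluster_upstream_rq_time_bucket over 15 minutes, grouped by source service, datacenter and namespace, for the destination service victoria-metrics.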

Reproduction Steps

  1. Visit topology view

Describe the solution you'd like

Metrics to be displayed in topology UI

Consul Version

1.11.3

Browser and Operating system details

Chromium, Gentoo, amd64. The metrics database is VictoriaMetrics.

Screengrabs / Web Inspector logs

[three images, not preserved]

nahsi commented 2 years ago

It is quite possible that this is a misconfiguration on my side, although I spent a lot of time trying to understand what is wrong, to no avail.

blake commented 2 years ago

Hi @nahsi,

 ui_config:
   enabled: true
   metrics_provider: "prometheus"
   metrics_proxy:
     base_url: "https://victoria-metrics.service.consul"
   dashboard_url_templates:
     service: !unsafe "https://grafana.service.consul/d/{{Service.Meta.dashboard}}"

This configuration structure is only valid when providing the UI configuration to the Consul agent via HCL or JSON. On Kubernetes, JSON can be provided using the server.extraConfig or client.extraConfig config options.

Since you're configuring this as YAML, I assume that you're trying to configure this for a Consul cluster that is deployed on Kubernetes. If so, you can use a slightly altered configuration that will be correctly deployed by the Helm chart. Try enabling the metrics proxy by adding the following snippet to your Helm values file.

ui:
  metrics:
    enabled: true
    provider: prometheus
    baseURL: https://victoria-metrics.service.consul
  dashboardURLTemplates:
    service: !unsafe "https://grafana.service.consul/d/{{Service.Meta.dashboard}}"

These configuration parameters are documented in the Helm chart's docs at https://www.consul.io/docs/k8s/helm#v-ui-metrics.

I noticed that https://www.consul.io/docs/connect/observability/ui-visualization does not correctly document how to configure these settings when using the Helm chart. I'll create a PR to address this.

nahsi commented 2 years ago

Hi @blake, hi @johncowen

I'm not running Consul on Kubernetes; I'm running it on bare metal. I configure Consul with Ansible and store the configuration in the Ansible inventory as YAML, then copy it to the host through the to_nice_json filter:

- name: create consul config
  tags: config
  copy:
    content: "{{ consul_config | to_nice_json }}"
    dest: "{{ consul_dirs.main.path }}/consul.json"
    owner: "{{ consul_user }}"
    group: "{{ consul_group }}"
    mode: 0640
    validate: "consul validate -config-format=json %s"
  notify: restart consul

Sorry for the confusion. I'm so used to dealing with Consul/Nomad/Vault configs in YAML that I assume everyone does the same.

This is what the resulting JSON looks like on the host:

    "ui_config": {
        "dashboard_url_templates": {
            "service": "https://grafana.service.consul/d/{{Service.Meta.dashboard}}"
        },
        "enabled": true,
        "metrics_provider": "prometheus",
        "metrics_proxy": {
            "base_url": "https://victoria-metrics.service.consul"
        }
    }

The documentation made it pretty clear to me that configuring Consul on Kubernetes and outside of it uses different options.

nahsi commented 2 years ago

@blake sorry, can you reopen this ticket?

blake commented 2 years ago

My apologies. It closed when the PR was merged, but before we had resolved this conversation.

Have you restarted the agent after making this configuration change to enable the metrics proxy? If so, are you still seeing the 404 errors?

nahsi commented 2 years ago

@blake yes I did, it didn't help.

I will provide more info and do some testing soon, please don't close this issue :)

tristanmorgan commented 9 months ago

This may be related. A strange thing is that Consul does not seem to update the "Host:" header when making the query, so when my load balancer gets the Prometheus queries it sends them straight back to Consul, which returns a 404.
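One way to narrow this down is to send the same query once directly to the metrics backend and once through the agent's metrics proxy, then compare the responses. Below is a minimal sketch that only builds the proxied URL. The agent address and the internal proxy path prefix are assumptions for illustration and are not taken from this thread, so check them against your own agent before relying on them.

```python
from urllib.parse import quote, urlencode

# Assumptions, not taken from the thread: the agent's HTTP address and the
# internal path prefix under which the agent serves the UI metrics proxy.
AGENT_ADDR = "http://127.0.0.1:8500"
PROXY_PREFIX = "/v1/internal/ui/metrics-proxy"


def proxied_url(provider_path: str, params: dict) -> str:
    """Return the agent URL that should forward provider_path to the backend."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    return f"{AGENT_ADDR}{PROXY_PREFIX}{provider_path}?{query}"


url = proxied_url("/api/v1/query", {"query": "up"})
print(url)
```

If the direct request succeeds while the proxied one returns a 404, the problem most likely sits between the agent and the backend (for example the Host header behaviour described above) rather than in the query itself.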

n6g7 commented 7 months ago

@tristanmorgan I ran into this one too and found this PR: https://github.com/hashicorp/consul/pull/13071 tl;dr: you can now use the add_headers configuration option to force-set the Host header. That fixed the issue for me.
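For anyone generating the agent config from structured data, as earlier in this thread, here is a minimal sketch of what the resulting ui_config could look like. It assumes that add_headers sits under metrics_proxy as a list of name/value entries, and that the Host value should be the host part of base_url. The helper with_host_header is hypothetical, and whether the Host header is honoured depends on running a Consul version that includes the PR above.

```python
import json
from urllib.parse import urlsplit


def with_host_header(ui_config: dict) -> dict:
    """Return a copy of ui_config whose metrics proxy force-sets a Host header.

    Hypothetical helper: the header value is the host part of
    metrics_proxy.base_url, which is an assumption about what the backend expects.
    """
    proxy = dict(ui_config["metrics_proxy"])
    host = urlsplit(proxy["base_url"]).netloc
    proxy["add_headers"] = [{"name": "Host", "value": host}]
    return {**ui_config, "metrics_proxy": proxy}


# ui_config as posted earlier in this thread (dashboard template left out).
ui_config = {
    "enabled": True,
    "metrics_provider": "prometheus",
    "metrics_proxy": {"base_url": "https://victoria-metrics.service.consul"},
}

patched = with_host_header(ui_config)
print(json.dumps({"ui_config": patched}, indent=4, sort_keys=True))
```

The agent needs a restart after the change, the same as for any other ui_config edit.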