infinityworks / prom-conf

Prometheus config container
MIT License

How to get metrics by Rancher environment #2

Closed johnrengelman closed 7 years ago

johnrengelman commented 8 years ago

So I've deployed the Prometheus stack and I've updated the pieces as I see you've made some changes to host node discovery and such. Things are working pretty solid, but I'm failing to see how I could separate nodes based on Rancher environment. Not sure this is the best place for this comment, but the HostMetrics and ContainerMetrics show up with instance=<ip>:<port> and no other metadata AFAICT. I'm pretty new to Prometheus itself, so maybe there's just some basic item I'm missing.

Rucknar commented 8 years ago

I'm literally testing that stuff in my reference deployment now! Are you wanting to view nodes registered to the rancher server, but running in another 'rancher-environment'?

johnrengelman commented 8 years ago

Yeah. Exactly. We have multiple Rancher environments (independent clusters) that are all controlled by 1 Rancher Server. We use them to break apart both various teams and "environments" in the traditional sense (dev, qa, production, etc). So it would be great to be able to see cluster performance by Rancher environment - host and container stats.

johnrengelman commented 8 years ago

Also, if there's something I can help with, let me know. I can dig into and test some things.

Rucknar commented 8 years ago

Thanks @johnrengelman. It's on my list to add, but it won't work out of the box. I suspect the DNS auto-discovery will only work within an environment (I'll check that).

I previously got hostnames using confd, but that pulled the hosts from the metadata service, and the API keys used in that scenario limit you to a single environment. The cleanest way would be through DNS discovery; I'll have a dig to see if there's a way it can be done.

johnrengelman commented 8 years ago

You can create global API keys in Rancher. You have to do it via the API though (it's not available in the UI): https://forums.rancher.com/t/api-key-for-all-environments/279

I've done this for my own use locally (I have a single API key that I can use to run rancher-compose against multiple environments).

Anyway... I'm trying some changes to the rancher-prometheus exporter now, but I'm guessing we'd need to wrap the node-exporter with something that would let it attach a label to its metrics corresponding to its Rancher environment.

Rucknar commented 8 years ago

Today I learned...

If that's the case, I have a confd template that will run through the hosts listed by the rancher-prometheus-exporter. The exporter gets its keys through the labels; I'll have a look to see if the global value can be used with labels.

Regarding node-exporter, I'm sort of wondering if we could get those stats out of cAdvisor and do away with node-exporter in the long term. The cAdvisor GUI exposes things like CPU cores, etc.

johnrengelman commented 8 years ago

The other thing I'm having trouble grokking is how to handle custom host labels from Rancher. You can assign free-form labels to hosts in Rancher and use them for scheduling. It might be nice to somehow use those too.

Rucknar commented 8 years ago

In terms of how to get those labels into Prometheus for querying?

johnrengelman commented 8 years ago

Yeah.

Rucknar commented 8 years ago

Tricky one. I could see how it could be done if we pulled all that data from the API. It would need a bit of attention within the api-exporter, but in theory that info is all there and could be used to populate the config. That said, I suspect we'd lose DNS auto-discovery. Hmm.
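One detail to watch if host labels are pulled from the API into the Prometheus config: Prometheus label names may only contain letters, digits and underscores, while Rancher host labels are free-form. A minimal Python sketch of one way to map them follows; the function name and the rancher_host_label_ prefix are made up for illustration and are not part of any exporter in this thread.

```python
import re

# Anything outside [a-zA-Z0-9_] is not allowed in a Prometheus label name.
_INVALID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def to_prometheus_labels(host_labels, prefix="rancher_host_label_"):
    """Map free-form host labels to valid Prometheus label names.

    The prefix (hypothetical) keeps the result from starting with a digit
    and from clashing with labels Prometheus sets itself.
    """
    result = {}
    for key, value in host_labels.items():
        name = prefix + _INVALID_CHARS.sub("_", key)
        result[name] = str(value)  # label values must be strings
    return result


if __name__ == "__main__":
    print(to_prometheus_labels({"team.name": "payments", "tier": 1}))
```

Note that two different host labels (say "a.b" and "a_b") would collapse to the same name under this scheme, so a real implementation would need to decide how to handle that.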

johnrengelman commented 8 years ago

Actually, thinking about this more, the rancher-prometheus integration doesn't need to scan all environments, you could run a single copy in each environment that points to the same prometheus server. It's just that the data that it kicks out needs to include the Rancher environment name as part of it.

Rucknar commented 8 years ago

@johnrengelman Will look at getting that added in soon.

ampedandwired commented 8 years ago

+1 for this. I've ended up writing a small script that pulls host information from the Rancher API (using admin keys that can see all environments) and feeds it into Prometheus using file_sd_configs. Not as pretty as DNS discovery, but it seems to work OK.

Rucknar commented 8 years ago

@ampedandwired That's not actually a bad way of doing it. There is the exporter (https://github.com/infinityworksltd/prometheus-rancher-exporter) that gives some of that info.

If you're sticking with the environment model in Rancher as it is, federation in Prometheus is probably the most obvious way, though it's not easy to auto-discover additional environments.

Or again of course, you could utilise something like Consul...

marlinhares commented 7 years ago

@ampedandwired Please can you share the script and the path where I should run it? We want to use Prometheus, but just one instance for all our environments in Rancher.

ampedandwired commented 7 years ago

@marlinhares Sorry, not easily able to share the script because it's mixed in with a bunch of internal stuff that I can't release. Maybe someday I'll get a chance to separate it out. The basic idea is (pseudocode only):

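import json

# Pseudocode: rancher_api and current_prometheus_config() stand in for
# your own Rancher API client and a helper that reads the existing file.
prometheus_config = []
for rancher_env in rancher_api.get_environments():
  for host in rancher_api.get_hosts(rancher_env):
    ip_address = host["ipAddresses"][0]["address"]
    port = 9104  # CAdvisor port
    target = f"{ip_address}:{port}"

    # One target group per host, tagged with its Rancher environment
    prometheus_config.append({
      "targets": [target],
      "labels": {
        "rancher_environment_name": rancher_env["name"]
      }
    })

# Only rewrite the file when the list of targets has changed
if prometheus_config != current_prometheus_config():
  with open("/etc/rancher-prometheus/hosts.json", "w") as f:
    f.write(json.dumps(prometheus_config))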

This results in a list of entries like this in /etc/rancher-prometheus/hosts.json:

{
  "targets": [
    "10.0.12.6:9104"
  ],
  "labels": {
    "rancher_environment_name": "production"
  }
}
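For anyone who wants something to run without a Rancher server, here is a self-contained Python sketch of the same idea. It is not the original script: the Rancher API calls are replaced by a plain dict of environment name to host IPs, and the function names build_target_groups and write_if_changed are made up for illustration. It writes through a temporary file and renames it, on the assumption that you don't want Prometheus to pick up a half-written file.

```python
import json
import os
import tempfile


def build_target_groups(hosts_by_env, port=9104):
    """Return one file_sd target group per host, labelled with its environment.

    hosts_by_env is a stand-in for data fetched from the Rancher API:
    a dict mapping environment name to a list of host IP addresses.
    """
    groups = []
    for env_name in sorted(hosts_by_env):
        for ip_address in sorted(hosts_by_env[env_name]):
            groups.append({
                "targets": [f"{ip_address}:{port}"],
                "labels": {"rancher_environment_name": env_name},
            })
    return groups


def write_if_changed(path, groups):
    """Write groups to path as JSON only if the content differs.

    Returns True if the file was (re)written, False if it was already current.
    """
    new_text = json.dumps(groups, indent=2, sort_keys=True)
    try:
        with open(path) as f:
            if f.read() == new_text:
                return False
    except FileNotFoundError:
        pass
    # Write to a temp file in the same directory, then rename over the target.
    # The .tmp suffix keeps it out of a '*.json' glob while it is being written.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(new_text)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp_path, path)
    return True


if __name__ == "__main__":
    example = {"production": ["10.0.12.6"], "dev": ["10.0.1.2"]}
    print(json.dumps(build_target_groups(example), indent=2))
```

Sorting the environments and hosts keeps the output stable between runs, so the "only write when changed" check doesn't fire just because the API returned hosts in a different order.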

Obviously /etc/rancher-prometheus/ needs to be mounted as a volume on your prometheus container.

And then you can tell prometheus to watch the directory that the file is being written to:

  - job_name: 'rancher'
    file_sd_configs:
      - files: ['/etc/rancher-prometheus/*.json']

Finally, I use "global" Rancher keys (mentioned in a previous comment above), so I only need one instance of this container for the whole Rancher installation.

Anyway, hope this helps.