QubitProducts / exporter_exporter

A reverse proxy designed for Prometheus exporters
Apache License 2.0

Added discovery feature #65

Closed (jonaz closed this 1 year ago)

jonaz commented 3 years ago

Used to automatically discover enabled modules on localhost. Will be used together with https://github.com/FortnoxAB/prometheus-net-discovery

The idea is to dynamically discover modules based on what's running on the machine at the moment, so we don't need another automation tool to reconfigure exporter_exporter when exporters are added or removed on a VM. We deploy some software using Kubernetes (with hostNetwork) and some with Ansible.

Now all those services can be automatically discovered by exporter_exporter and exposed through the JSON API on port 9999, which prometheus-net-discovery uses to configure Prometheus for scraping.

Our expexp.yaml now looks like this:

discovery: 
  enabled: true
  exporters: 
    node:
      port: 9100
    minio:
      port: 9091
      path: "http://%s/minio/prometheus/metrics"
    elasticsearch:
      port: 9114
    haproxy:
      port: 9101
    mysql:
      port: 9104
    nginx:
      port: 9113
    redis:
      port: 9121
    memcached:
      port: 9150
    postfix:
      port: 9154
    postgres:
      port: 9187
    pgbouncer:
      port: 9188
    barman:
      port: 9189
    php-fpm:
      port: 9253
    kafka:
      port: 9308
    389ds:
      port: 9496
    imap-mailbox:
      port: 9893
    etcd:
      port: 2379
      path: "http://%s/metrics"
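To illustrate the idea, here is a minimal Go sketch of one way such discovery could work: try a TCP connection to each configured port on localhost and report the names that answer. This is an assumption for illustration, not the code in this PR. The `Exporter` type, the `Discover` and `URL` functions, and the `/metrics` default path are all hypothetical.

```go
package main

import (
	"net"
	"sort"
	"strconv"
	"strings"
	"time"
)

// Exporter mirrors one entry under discovery.exporters in the config above.
// The type and its field names are hypothetical, not the PR's actual types.
type Exporter struct {
	Port int
	Path string // optional URL template; "%s" is replaced with host:port
}

// URL returns the scrape URL for the exporter on the given host.
// Falling back to /metrics when no path is set is an assumption.
func (e Exporter) URL(host string) string {
	addr := net.JoinHostPort(host, strconv.Itoa(e.Port))
	if e.Path == "" {
		return "http://" + addr + "/metrics"
	}
	return strings.Replace(e.Path, "%s", addr, 1)
}

// Discover returns the sorted names of the exporters whose port accepts
// a TCP connection on host within the timeout.
func Discover(host string, exporters map[string]Exporter, timeout time.Duration) []string {
	var found []string
	for name, e := range exporters {
		addr := net.JoinHostPort(host, strconv.Itoa(e.Port))
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			continue // nothing listening on this port right now
		}
		conn.Close()
		found = append(found, name)
	}
	sort.Strings(found)
	return found
}
```

Calling `Discover("127.0.0.1", exporters, time.Second)` periodically would keep the module list in step with whatever is running. A plain TCP check cannot tell an exporter from any other service on the same port, so a real implementation would probably also want to request the metrics URL.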

Then we added https://github.com/FortnoxAB/prometheus-net-discovery/blob/master/main.go#L348 so that targets like this are automatically configured for discovered hosts. We also make sure that a job named node (and one for each of the other exporters in the list above) is configured in Prometheus.

/etc/prometheus/file_sd/node.json

[
        {
                "targets": [
                        "10.0.0.104:9999"
                ],
                "labels": {
                        "__metrics_path__": "/proxy",
                        "__param_module": "node",
                        "host": "k8s-dev001-worker002.asdf.com",
                        "subnet": "10.0.0.0/24"
                }
        },
.....
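For reference, a target group like the one above could be produced with a few lines of Go. This is only a sketch of the file_sd shape shown here, not the code in prometheus-net-discovery. `TargetGroup`, `ProxyTarget` and `MarshalFileSD` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"net"
	"strconv"
)

// TargetGroup is one entry in a Prometheus file_sd JSON file.
type TargetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels"`
}

// ProxyTarget builds the target group for one discovered module that is
// scraped through exporter_exporter's /proxy endpoint.
func ProxyTarget(ip string, proxyPort int, module, host, subnet string) TargetGroup {
	return TargetGroup{
		Targets: []string{net.JoinHostPort(ip, strconv.Itoa(proxyPort))},
		Labels: map[string]string{
			"__metrics_path__": "/proxy", // scrape path on the proxy
			"__param_module":   module,   // becomes ?module=<name> on the scrape URL
			"host":             host,
			"subnet":           subnet,
		},
	}
}

// MarshalFileSD renders the target groups as indented JSON, ready to be
// written to a file such as /etc/prometheus/file_sd/node.json.
func MarshalFileSD(groups []TargetGroup) ([]byte, error) {
	return json.MarshalIndent(groups, "", "\t")
}
```

With the values from the example, `ProxyTarget("10.0.0.104", 9999, "node", "k8s-dev001-worker002.asdf.com", "10.0.0.0/24")` gives a group with the same target and labels as the first entry above.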

We have been running this in production on 24 k8s clusters and about 500 VMs for 2 months. It has worked really well and removed a lot of manual configuration when services are added.

I also did some refactoring and optimization of the code, fixed some race conditions, and applied some autoformatting so that comments etc. follow idiomatic Go.

jonaz commented 3 years ago

Closing. Will merge to master in our fork first and then open a new PR!

tcolgate commented 3 years ago

Is the idea here to pull content from another service? My instinct is that it would be better to just get SIGHUP handling for rereading the config working, and then rely on config management. That feels like a more common workflow.

jonaz commented 3 years ago

Nope, the idea is not to pull from another service. The idea is to dynamically discover modules based on what's running on the machine at the moment, so we don't need another automation tool to reconfigure exporter_exporter when exporters are added or removed on a VM.

I'll explain in more detail in the final PR next week when we have tested this in production on around 500 VMs.

We have already used this "discovery pattern" in production for 3 years on 800 VMs, but going directly to the exporters and not through the awesome exporter_exporter proxy :)

tcolgate commented 3 years ago

Okay, I'll wait for the final PR. Thanks.

jonaz commented 3 years ago

This is now open for review! I'll edit my first post in a minute.

tcolgate commented 3 years ago

I should warn you that it is going to be a while before I can get round to giving this a proper review. We're busy on internal projects at the moment, and this is quite a significant change that I want to fully understand (and not a change that we need ourselves).

tcolgate commented 1 year ago

Sorry, I realise it has been a ridiculously long time since you opened this PR. I've just taken a look and it looks good, though I don't know how popular the related prom discovery tools really are. That said, I'd like to get one other PR merged, then I'll look at rebasing and merging this. Assuming you are still using it?

jonaz commented 1 year ago

@tcolgate yes we still use this to discover exporters on about 1500 VMs.

But our master branch is currently more up to date, with verify removed and other optimizations, so it's about 22 times faster in initial benchmarks. https://github.com/FortnoxAB/exporter_exporter/pull/6

Currently master also contains stuff specific to us, but feature/discovery should be OK!

There is no hurry, our fork runs just fine :D

tcolgate commented 1 year ago

If you are happy maintaining your fork, I may just close this. It does feel like a very niche case, and I think the majority of people will be happy with something like ansible doing the smarts here.

jonaz commented 1 year ago

@tcolgate until they start using 30+ k8s clusters with exporters as daemonsets ;)