nlowe closed this issue 3 years ago.
Hi @nlowe, thanks for the report.
Let me first quickly explain that we have 2 use cases when scraping services:
- Where a single endpoint behind the service can provide all the metrics;
- Where all the endpoints behind the service need to be scraped for metric collection to be considered complete.
Your proposal fixes point 2 while breaking point 1, though.
I could manually update the target list every time a new pod gets deployed but that's not exactly ideal.
Can you speak a bit more about your scenario? Are you working with a {Daemon|Stateful|Replica}Set? If you are working with any of these resources, the recommendation is to add the scrape label to the pod template so that all the created pods are labelled automatically.
For example, have a look at this kube-dns deployment template from the official Kubernetes repo: https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/kube-dns/kube-dns.yaml.in.
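For illustration, a minimal sketch of a pod template carrying the scrape label, assuming the commonly used prometheus.io/scrape label key; the deployment name, app label, image, and port are hypothetical:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                         # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        prometheus.io/scrape: "true"   # every pod created from this template gets the label
    spec:
      containers:
        - name: my-app
          image: my-app:latest         # illustrative image
          ports:
            - containerPort: 8080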
Ah, I guess that makes sense. I was expecting nri-prometheus to be able to mirror my existing scrape configs.
For applications we deploy, moving the scrape annotation is simple enough. It's a little more involved for things that are deployed for us by our cloud provider, but it should be manageable. I suppose I could also scrape the Prometheus federated endpoint and let Prometheus do the discovery.
I'm not familiar with an application that can provide all of its metrics from a single endpoint; I would imagine use case 2 is far more common. Would you consider accepting a pull request to add support for optionally scraping endpoints instead?
Would you consider accepting a pull request to add support for optionally scraping endpoints instead?
This is definitely something we're looking forward to supporting, so yes.
How do you think this should work? 🤔
My first thought was to add an option to the configuration file to select which discovery mode you wanted, but that's not super flexible if you need to use both discovery modes concurrently. It might make more sense to expand the existing targets option to be closer to the scrape_configs section in Prometheus. Currently it only supports specifying a static list of targets, but it would be cool if I could do something like this:
targets:
  - description: Scrape services with my.domain.io/custom = foo directly
    service:
      match:
        - prometheus.io/scrape: "true"
        - my.domain.io/custom: foo
  - description: Service Endpoints for everything else
    endpoints:
      match:
        - prometheus.io/scrape: "true"
      not_match:
        - my.domain.io/custom: foo
  - description: Secure etcd example
    static:
      urls: ["https://192.168.3.1:2379", "https://192.168.3.2:2379", "https://192.168.3.3:2379"]
      tls_config:
        ca_file_path: "/etc/etcd/etcd-client-ca.crt"
        cert_file_path: "/etc/etcd/etcd-client.crt"
        key_file_path: "/etc/etcd/etcd-client.key"
This would open the door to potentially providing other service discovery methods as well; however, this would be a pretty big breaking change.
Perhaps there's a way to have nri-prometheus piggy-back off of the existing prometheus discovery code?
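For reference, a sketch of the upstream Prometheus endpoints discovery this could piggy-back on, using the standard kubernetes_sd_configs from prometheus.yml (job name illustrative, relabel rules abbreviated to the two canonical ones):

scrape_configs:
  - job_name: kubernetes-service-endpoints
    kubernetes_sd_configs:
      - role: endpoints   # discover every endpoint behind each service
    relabel_configs:
      # keep only endpoints whose service carries the scrape annotation
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # rewrite the target port to the one advertised in prometheus.io/port
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__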
Your ideas make sense and are the same ones we were already investigating internally. We even have a prototype reusing the Prometheus discovery system. This is advantageous for many reasons.
Let me discuss this with our PM and get back to you. As you said, it's indeed a very big change (both technically and for customers). 👍
Hey @nlowe, I have an update from our PM: improved auto-discovery inspired by the one in the official Prometheus server is unfortunately not planned for this quarter. It's in our backlog, though, and we all consider it high value.
Hi @douglascamata, is there any plan to get service discovery implemented anytime soon? It would be really valuable for some of our use cases. Thanks
Bumping this one again 👍
Hi @douglascamata, thank you for your message. What you described in "1. Where a single endpoint behind the service can provide all the metrics" is what I am trying to achieve.
All the pods report the same value for the metric, so I don't need Prometheus to scrape all of them; scraping only the service is already good enough. Is that really possible? I could not find how to do it.
If you add the prometheus.io/scrape label to the service, nri-prometheus will discover that service and scrape it.
Therefore, if you have the label on the service and there is only one pod exposed by that service, you can be sure that pod is going to be the one scraped.
A service like the following, for example, is configured to be recognised and scraped by nri-prometheus.
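A minimal sketch of such a service, assuming the default prometheus.io/scrape label key; the service name, selector, and port are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: my-app                     # illustrative name
  labels:
    prometheus.io/scrape: "true"   # nri-prometheus discovers objects carrying this label
spec:
  selector:
    app: my-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080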
@brenwhyte @brianhuangyl unfortunately I am not currently working in the team that maintains this project. I do know, though, that auto-discovery inspired by Prometheus is not on the roadmap for this.
An idea: New Relic can now ingest data through the remote_write feature of the official Prometheus server: https://docs.newrelic.com/docs/integrations/prometheus-integrations/install-configure-remote-write/set-your-prometheus-remote-write-integration. This way you can keep using the Prometheus discovery that you already know and still get your data ingested.
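A minimal sketch of the prometheus.yml addition described in the linked docs; the cluster name and license key are placeholders, and the exact endpoint parameters may have changed since this was written:

remote_write:
  - url: https://metric-api.newrelic.com/prometheus/v1/write?prometheus_server=YOUR_CLUSTER_NAME
    bearer_token: YOUR_NEW_RELIC_LICENSE_KEY   # placeholder; use your own license key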
The fix was merged; we still need to release the new fix/feature, document it, and release the chart.
While services with the prometheus.io/scrape annotation can be discovered, it appears that nri-prometheus scrapes the service itself and not the service endpoints: https://github.com/newrelic/nri-prometheus/blob/d8a8f1b1c4ff34edc8d78ceecb422cc63b29e163/internal/pkg/endpoints/kubernetes.go#L178-L192

Usually you want the service endpoints to be discovered instead. This is useful for applications like kube-dns/CoreDNS that only expose the application port (53) on the service and not the metrics port (9153). In the current configuration, nri-prometheus cannot scrape metrics from kube-dns because the port indicated by prometheus.io/port isn't exposed on the service.

This also presents a problem for applications that do expose metrics on the same port as the service: nri-prometheus won't have a complete view of the application, because it'll only scrape a single pod each scrape interval instead of all the pods that back the service. I could manually update the target list every time a new pod gets deployed, but that's not exactly ideal.
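To illustrate the mismatch, a sketch of the shape of the kube-dns service described above; the 53/9153 ports come from the issue text, while the exact manifest fields and annotation placement are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9153"   # metrics port advertised here...
spec:
  selector:
    k8s-app: kube-dns
  ports:
    - name: dns                  # ...but only the DNS port is exposed on the service
      port: 53
      protocol: UDP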