GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0
888 stars, 218 forks

Prometheus scraping fails for cnrm-resource-stats-recorder #618

Open jnodorp-jaconi opened 2 years ago

jnodorp-jaconi commented 2 years ago

Bug Description

Prometheus scrapes of the cnrm-resource-stats-recorder fail with the following error message:

Get "http://x.x.x.x:8888/metrics": dial tcp x.x.x.x:8888: connect: connection refused

From my initial assessment, it looks like there is a misunderstanding of how Prometheus scrapes annotated services. Prometheus does not scrape the service itself; it only uses the service to discover the pods to scrape. Therefore the annotation must contain the pod's port, not the service port (48797 instead of 8888).
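
To illustrate why the port matters, here is a minimal Python sketch of the address rewrite that a commonly copied Prometheus relabel rule performs for annotated services. It assumes the scrape config joins `__address__` and the port annotation with `;` and applies the regex `([^:]+)(?::\d+)?;(\d+)`; whether a given setup uses exactly this rule is an assumption, and the pod IP `10.0.0.7` is made up for the example.

```python
import re

# Assumed relabel regex: keeps the host part of the discovered address and
# swaps in the port taken from the annotation. Prometheus anchors relabel
# regexes, which fullmatch() mirrors here.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def scrape_address(discovered_address: str, annotation_port: str) -> str:
    """Return the address that would be dialed after the rewrite."""
    joined = f"{discovered_address};{annotation_port}"
    match = ADDRESS_RULE.fullmatch(joined)
    if match is None:
        # The rule does not apply, so the discovered address is kept.
        return discovered_address
    return f"{match.group(1)}:{match.group(2)}"


# Hypothetical pod IP; the pod listens on 48797, the service exposes 8888.
print(scrape_address("10.0.0.7:48797", "8888"))   # pod IP with the service port
print(scrape_address("10.0.0.7:48797", "48797"))  # pod IP with the pod port
```

With the service port in the annotation, the rewritten target is the pod IP on a port the pod does not listen on, which would be consistent with the "connection refused" error above.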

See #615 for a fix.

Additional Diagnostic Information

Kubernetes Cluster Version

1.21

Config Connector Version

1.74.0

Config Connector Mode

cluster

Steps to Reproduce

Set up Prometheus to scrape annotated services, install Config Connector, and wait for the scrapes to fail.

xiaobaitusi commented 2 years ago

Hi @jnodorp-jaconi, thanks for reporting the bug. We will look into it and work on a fix.

xiaobaitusi commented 2 years ago

I've submitted a change internally, and the fix will be released within the next few versions. Thanks!

rasmus commented 6 months ago

@xiaobaitusi we just hit the reverse problem with scraping the cnrm-resource-stats-recorder-service from within the cluster using vmagent.

The service exposes port 8888, but the annotation specifies that port 48797 should be used.

Editing the annotation manually to 8888 causes vmagent to correctly scrape the service on port 8888 instead.
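
The two reports in this thread can be reconciled with a rough sketch of which port each discovery mode needs. It assumes the scraper follows Prometheus-style Kubernetes discovery roles, where `service` targets the Service on its exposed port and `endpoints` or `pod` targets the backing pod on its container port; the function name and the claim that the vmagent setup here uses service-level discovery are assumptions, not something confirmed in the thread.

```python
def port_to_annotate(role: str, service_port: int, target_port: int) -> int:
    """Pick the port a port annotation should carry for a discovery role."""
    if role == "service":
        # The scrape target is the Service itself, reached on its exposed port.
        return service_port
    if role in ("endpoints", "pod"):
        # The scrape target is the backing pod, reached on its container port.
        return target_port
    raise ValueError(f"unsupported discovery role: {role}")


# Ports as reported in this thread: service port 8888, pod port 48797.
print(port_to_annotate("service", 8888, 48797))
print(port_to_annotate("endpoints", 8888, 48797))
```

Under these assumptions, no single annotation value works for both discovery modes, which would explain why the fix for the original report produced the reverse problem.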

rasmus commented 6 months ago

The documentation also refers to the service having the port correctly configured as 8888.

https://cloud.google.com/config-connector/docs/how-to/monitoring-prometheus