Stackdriver / stackdriver-prometheus-sidecar

A sidecar for the Prometheus server that can send metrics to Stackdriver.
https://cloud.google.com/monitoring/kubernetes-engine/prometheus
Apache License 2.0

Unrecoverable error when remote writing NGINX Ingress metrics #267

Open bittermandel opened 3 years ago

bittermandel commented 3 years ago

We have a basic installation of the sidecar together with the Prometheus Operator, and have configured the sidecar with the following flags:

args:
- --stackdriver.project-id=${PROJECT_ID}
- --prometheus.wal-directory=/prometheus/wal
- --stackdriver.kubernetes.location=${CLUSTER_REGION}
- --stackdriver.kubernetes.cluster-name=${CLUSTER_NAME}
- --include={__name__=~"nginx_.+"}

The metrics we are exporting are the default ones from https://github.com/kubernetes/ingress-nginx. With this setup we get the following error a few times per minute. As a result, no metrics are written to Stackdriver, and the error does not make the cause clear.

Is there something we're missing when including metrics in that manner?

Thank you!

level=warn ts=2021-01-19T08:30:35.589Z caller=queue_manager.go:534 component=queue_manager msg="Unrecoverable error sending samples to remote storage" err="rpc error: code = InvalidArgument desc = Field timeSeries[36].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| does not have at least one entry."
level=warn ts=2021-01-19T08:30:35.622Z caller=queue_manager.go:534 component=queue_manager msg="Unrecoverable error sending samples to remote storage" err="rpc error: code = InvalidArgument desc = Field timeSeries[7].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| does not have at least one entry."
level=warn ts=2021-01-19T08:30:35.651Z caller=queue_manager.go:534 component=queue_manager msg="Unrecoverable error sending samples to remote storage" err="rpc error: code = InvalidArgument desc = Field timeSeries[0].points[0].distributionValue had an invalid value: Distribution |explicit_buckets.bounds| does not have at least one entry."
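
For readers unfamiliar with the message: it says that a distribution value was sent with an empty list of explicit bucket bounds. The Python sketch below only illustrates what that means, assuming a Prometheus-style histogram with cumulative `le` buckets is turned into finite bounds plus per-bucket counts. The function `to_explicit_buckets` and the sample numbers are hypothetical, and this is not the sidecar's actual conversion code; whether this is what happens in this issue is not established.

```python
import math

def to_explicit_buckets(cumulative):
    """Convert cumulative 'le' buckets into (finite bounds, per-bucket counts).

    `cumulative` maps each upper bound (a float, possibly +Inf) to the
    cumulative count reported for that bucket. Hypothetical helper for
    illustration only.
    """
    uppers = sorted(cumulative)
    # Only finite upper bounds become explicit bounds; +Inf is the overflow bucket.
    bounds = [u for u in uppers if math.isfinite(u)]
    counts = []
    previous = 0
    for u in uppers:
        # Cumulative counts are turned into per-bucket counts by differencing.
        counts.append(cumulative[u] - previous)
        previous = cumulative[u]
    return bounds, counts

# A histogram with two finite buckets yields two bounds.
normal = to_explicit_buckets({0.5: 3, 1.0: 5, math.inf: 6})

# A histogram that only has the +Inf bucket yields no bounds at all,
# which is the shape of payload the error message complains about.
degenerate = to_explicit_buckets({math.inf: 6})
```

In this sketch, `degenerate` has an empty bounds list, so a series that reaches the conversion step without any finite `le` bucket would be one way to end up with such a payload.
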
bittermandel commented 3 years ago

I have a feeling it is caused by the bucket metrics, which have over 10 labels. I tried to filter them out using --include{__name__!~".+bucket", __name__=~"nginx_.+"}, without success.
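
As a rough check of what that pair of matchers would select, here is a small Python sketch. It assumes the filter follows Prometheus selector semantics, where regex matchers are fully anchored and all matchers must hold at once. The helper `selected` and the list of metric names are illustrative assumptions, not output from a real cluster, and the sketch says nothing about how the sidecar parses the flag itself.

```python
import re

# Example metric names, assumed for illustration only.
names = [
    "nginx_ingress_controller_requests",
    "nginx_ingress_controller_request_duration_seconds_bucket",
    "nginx_ingress_controller_request_duration_seconds_sum",
    "nginx_ingress_controller_request_duration_seconds_count",
    "go_goroutines",
]

def selected(name, matchers):
    """Return True if `name` satisfies every (operator, pattern) matcher.

    Patterns are matched against the whole name, mirroring the anchored
    regex matching used by Prometheus selectors.
    """
    for op, pattern in matchers:
        hit = re.fullmatch(pattern, name) is not None
        if op == "=~" and not hit:
            return False
        if op == "!~" and hit:
            return False
    return True

# The two matchers from the filter above.
MATCHERS = [("!~", ".+bucket"), ("=~", "nginx_.+")]

kept = [n for n in names if selected(n, MATCHERS)]
```

Under these assumptions, `kept` holds the `nginx_` series except the one ending in `bucket`, so the `_sum` and `_count` series of the histogram would still be selected.
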

jsirianni commented 2 years ago

@bittermandel Were you able to resolve this? I think I have a similar issue with the following args:

    args:
    - "--stackdriver.project-id=<project>"
    - "--prometheus.wal-directory=/prometheus/wal"
    - "--stackdriver.kubernetes.location=us-east1-b"
    - "--stackdriver.kubernetes.cluster-name=<cluster name>"