Stackdriver / stackdriver-prometheus

Prometheus support for Stackdriver
https://cloud.google.com/monitoring/kubernetes-engine/prometheus
Apache License 2.0

Cannot change the kind of metrics due to metric kind mismatch #21

Closed jcao219 closed 5 years ago

jcao219 commented 5 years ago

What did you do?

Originally, my application exposed a counter metric of some name.

I changed it to a gauge metric of the same name.

What did you expect to see?

The metric continues to work.

What did you see instead? Under which circumstances?

The logs of the prometheus pod show:

caller=queue_manager.go:568 component=remote msg="Unrecoverable error sending samples to remote storage" err="rpc error: code = InvalidArgument desc = One or more TimeSeries could not be written: Metric kind for metric external.googleapis.com/prometheus/my_metric must be CUMULATIVE, but is GAUGE.: timeSeries[10]"

The metric no longer receives any data.
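To find out which descriptor is affected, the metric type and the two kinds can be pulled out of that error text with a small regular expression. This is only a sketch: the function name parse_kind_mismatch is hypothetical (it is not part of stackdriver-prometheus), and the pattern assumes the message wording is exactly as in the log line above.

```python
import re
from typing import NamedTuple, Optional


class KindMismatch(NamedTuple):
    metric_type: str    # e.g. external.googleapis.com/prometheus/my_metric
    expected_kind: str  # kind stored in the existing descriptor
    actual_kind: str    # kind of the samples being written


# Assumes the wording seen in the log above; other wordings will not match.
_PATTERN = re.compile(r"Metric kind for metric (\S+) must be (\w+), but is (\w+)")


def parse_kind_mismatch(log_line: str) -> Optional[KindMismatch]:
    """Return the mismatch details, or None if the line does not match."""
    match = _PATTERN.search(log_line)
    if match is None:
        return None
    return KindMismatch(*match.groups())


SAMPLE = (
    "One or more TimeSeries could not be written: Metric kind for metric "
    "external.googleapis.com/prometheus/my_metric must be CUMULATIVE, "
    "but is GAUGE.: timeSeries[10]"
)
```

Calling parse_kind_mismatch(SAMPLE) yields the metric type together with CUMULATIVE as the expected kind and GAUGE as the actual kind.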

Environment

GKE cluster.

Solution

  1. Used the projects.metricDescriptors/delete API to delete the existing metric (I'm not sure whether this step was needed)

  2. Restarted the stackdriver-prometheus application in every cluster

jkohen commented 5 years ago

Hi Jimmy, thanks for reporting! Using the projects.metricDescriptors/delete API to delete the existing metric was the right fix. Restarting stackdriver-prometheus shouldn't be necessary.
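For anyone who needs to repeat the deletion described above, here is a minimal sketch of the REST call using only the Python standard library. It assumes the Monitoring API v3 endpoint and a descriptor name of the form projects/PROJECT_ID/metricDescriptors/METRIC_TYPE. The helper names, the project ID, and the way the access token is obtained (for example from `gcloud auth print-access-token`) are illustrative assumptions, not something stated in this thread.

```python
import urllib.request

# Assumed base URL of the Monitoring API v3 REST surface.
MONITORING_API = "https://monitoring.googleapis.com/v3"


def descriptor_name(project_id: str, metric_type: str) -> str:
    """Build the resource name of a metric descriptor."""
    return f"projects/{project_id}/metricDescriptors/{metric_type}"


def build_delete_request(project_id: str, metric_type: str,
                         access_token: str) -> urllib.request.Request:
    """Prepare (but do not send) the DELETE request for a descriptor."""
    url = f"{MONITORING_API}/{descriptor_name(project_id, metric_type)}"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def delete_descriptor(project_id: str, metric_type: str,
                      access_token: str) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = build_delete_request(project_id, metric_type, access_token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a hypothetical project, the call would look like delete_descriptor("my-project", "external.googleapis.com/prometheus/my_metric", token). After the old descriptor is gone, the next write from stackdriver-prometheus should be able to create it again with the new kind.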

Having to update the metric descriptor this way is a product decision by Stackdriver, to prevent accidentally losing data due to writes with incompatible descriptors. We realize this behavior is different from Prometheus.

I'll update our FAQ at https://cloud.google.com/monitoring/kubernetes-engine/prometheus soon.