GoogleCloudPlatform / opentelemetry-operations-go

Apache License 2.0
130 stars 100 forks source link

GMP exporter should convert Int values to Double #798

Closed dashpole closed 4 months ago

dashpole commented 8 months ago

GMP only accepts Double values. Sending it an int will cause the metric to be rejected. This is not correct. GMP accepts int and double values.

This can be done with the transform processor today, although I think this needs to be filtered so it only applies to int valued datapoints:

      transform/inttodouble:
        metric_statements:
          - context: datapoint
            statements:
              # GMP metrics are expected to be double
              - set(value_double, Double(value_int))

Ideally, the exporter should handle this, rather than requiring users to add processors. I encountered this when using the spanmetrics exporter, which produces int-valued metrics.

dashpole commented 6 months ago

This is tricky. If we do this in the exporter, existing users will encounter errors when they upgrade, since their existing metric descriptors are the wrong type. We would likely need to make this opt-in behavior.

aabmass commented 6 months ago

If GMP only supports doubles, how could their descriptors have the wrong type on upgrade?

dashpole commented 6 months ago

I think I must have been mistaken. There was a user that was able to write int metrics, and had problems switching to double.

dashpole commented 5 months ago

Confirmed that GMP accepts both int and double metrics

dashpole commented 5 months ago

In that case, we should probably not change the value type. If users have the same metric in both INT and Double, they can use the above workaround to merge them together.

dashpole commented 5 months ago

Actually, we should document this behavior on the exporter

olekgo commented 3 months ago

Commenting here since it seems very relevant but happy to open a new issue.

If users have the same metric in both INT and Double, they can use the above workaround to merge them together.

Is it possible to change the value type of a metric in GMP after it has been created?

I've tried deleting the metric descriptor via the API (doc) and then pushing the metric with a new type. This approach worked partially. I can now write the metric (previously, I was getting a Value type for metric prometheus.googleapis.com/container_memory_rss/gauge conflicts with the existing value type (INT64). error). However, I now encounter the following error when trying to query the metric:

An error occurred requesting data. bad_data: invalid parameter "query": substitute queries failed: convert vector selector failed: mismatching value types between inputs to the union: [DOUBLE INT64]

This has broken the metric in the whole Metric Scope - it can no longer be queried via GMP/PromQL (MQL seems to work)

dashpole commented 2 months ago

@olekgo i'm no expert in metrics scopes, but I suspect it is still an INT in a different project in the metrics scope? If you are able, I would open a bug/ticket with the standard GCP support channel.

olekgo commented 2 months ago

i'm no expert in metrics scopes, but I suspect it is still an INT in a different project in the metrics scope?

Correct. ~But I'm getting the error even when querying the metric from the project where it is stored (without using Scoping Project), so there shouldn't be a conflict with other projects.~ UPD: removing affected projects from the scope did make it possible to query the metric. I've already opened a ticket, I was just wondering if there's anything more that can be done. Will report the outcome of the ticket once it's solved.

olekgo commented 2 months ago

Update: The issue was a metrics collision - a conflict between the container_memory_rss and container.memory.rss metrics from different sources. The container_memory_rss metric was sourced from cAdvisor via the Prometheus receiver, while the container.memory.rss metric came from the kubeletstats receiver. The GMP exporter renamed container.memory.rss to container_memory_rss, causing a collision with the metric from cAdvisor.

This collision was difficult to troubleshoot because the metric from Prometheus was of type DOUBLE, whereas the metric from kubeletstats was of type INT. As a result, only one metric (the first to create the metrics descriptor) could be accepted, which prevented the production of typical logs associated with metrics collisions.

The issue was further complicated by the fact that the renaming done by GMP exporter was not immediately apparent. I used the file exporter (to see the raw metrics), which did not reflect the renaming since it occurs only in the GMP exporter. This made the renaming difficult to notice and added to the complexity of diagnosing the issue.

PS: I'm not 100% sure that GMP does the renaming, but I think I've seen it somewhere in the docs. Previously I had to rename metrics myself because I couln't query a metric containing . using PromQL, but looks like its now done by GMP.

dashpole commented 2 months ago

GMP does the renaming following the OTel-> Prometheus translation spec. Sorry for the difficulty and confusion.