Open rafal-dudek opened 2 years ago
Relates to #2232 / work originally done by @tmarszal, all the logic above can be derived from opentelemetry-operations-go written by Google itself (e.g. 1, 2)
Can we get some updates on this? Are there any plans to fix the Cumulative types for StackdriverMeterRegistry?
Hey @rafal-dudek ! I'm no Stackdriver expert, do you think you could write a test that would be failing with the current implementation? That would be easier to reason about I guess.
@marcingrzejszczak Here is a sample test: https://github.com/micrometer-metrics/micrometer/compare/main...rafal-dudek:micrometer:stackdriverCounterTest?expand=1 You may want to write it differently in your repo, but I hope it shows the concept well enough.
I spotted two errors in the current implementation of CUMULATIVE metrics in StackdriverMeterRegistry. They are more of a design flaw than a bug.
1) StartTime and EndTime are set individually for each batch, as separate intervals (with a 1 ms delay between start and end time). https://github.com/micrometer-metrics/micrometer/blob/main/implementations/micrometer-registry-stackdriver/src/main/java/io/micrometer/stackdriver/StackdriverMeterRegistry.java#L313-L314
But as you can read here: https://cloud.google.com/monitoring/api/ref_v3/rest/v3/projects.metricDescriptors#MetricKind https://cloud.google.com/monitoring/mql/reference#time-series https://cloud.google.com/monitoring/api/ref_v3/rest/v3/TimeSeries#Point
Sending the metric the way the current implementation does causes Stackdriver to detect a restart of the metric at each new point, which causes problems with e.g. rate functions. StartTime should probably be stored for each metric separately, as the first publish time minus 'step'.
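To illustrate the idea of a per-metric StartTime, here is a minimal sketch. It is not code from StackdriverMeterRegistry: the class name CumulativeStartTimes, the method startTimeFor and the use of a plain String as the metric key are all hypothetical, and times are simplified to epoch milliseconds.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Micrometer: remembers one start time per metric.
class CumulativeStartTimes {

    private final long stepMillis;

    // Assumption: a metric is identified by a plain string key (e.g. name plus tags).
    private final Map<String, Long> startByMetric = new ConcurrentHashMap<>();

    CumulativeStartTimes(long stepMillis) {
        this.stepMillis = stepMillis;
    }

    // On the first publish of a metric, the start time is fixed to
    // (publish time - step). Every later publish returns that same value,
    // so only the end time of the interval moves forward.
    long startTimeFor(String metricId, long publishTimeMillis) {
        return startByMetric.computeIfAbsent(metricId, id -> publishTimeMillis - stepMillis);
    }
}
```

With a 60 s step, a metric first published at t = 120000 ms would get start time 60000 ms, and a later publish at t = 180000 ms would reuse 60000 ms as its start, so consecutive points share one start time instead of each looking like a restart.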
2) The second problem is the value of a CUMULATIVE metric. As StackdriverMeterRegistry is a StepMeterRegistry, it uses StepCounter for counters, StepFunctionCounter for function counters, etc. After each publish, such a meter is reset. And as we can read in the documentation mentioned earlier:
To sum up, when config.useSemanticMetricTypes() is true, we should have: