hertz-contrib / obs-opentelemetry

OpenTelemetry for Hertz
Apache License 2.0

use spanmetricsprocessor alike to generate metrics from span #17

Closed futurist closed 1 year ago

futurist commented 1 year ago

Is your feature request related to a problem? Please describe.

Currently the metrics are hard-coded into the middleware, in the section below:

https://github.com/hertz-contrib/obs-opentelemetry/blob/f719a52136f0dd2dfdcc03f9396bc5c7fbfd7c59/tracing/middleware.go#L88

This makes the metrics hard to extend; an OpenTelemetry Collector processor could be a better fit for the job.

Describe the solution you'd like

Use spanmetricsprocessor or create a new processor for hertz to extract metrics.
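
To make the request concrete, here is a minimal sketch of the general idea of deriving metrics from finished spans. It is a simplified, stdlib-only model, not the spanmetricsprocessor's actual code or configuration; `Span`, `Key`, `Stats`, and `Aggregate` are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Span is a hypothetical, simplified stand-in for a finished trace span.
type Span struct {
	Service   string
	Operation string
	Duration  time.Duration
	IsError   bool
}

// Key identifies one metric series (the dimensions the metrics are grouped by).
type Key struct {
	Service   string
	Operation string
}

// Stats holds the aggregates derived from all spans that share a Key.
type Stats struct {
	Calls    int
	Errors   int
	TotalDur time.Duration
}

// Aggregate derives call counts, error counts, and total latency from spans.
// Each span is handled on its own; no upstream/downstream correlation is done.
func Aggregate(spans []Span) map[Key]Stats {
	out := make(map[Key]Stats)
	for _, s := range spans {
		k := Key{Service: s.Service, Operation: s.Operation}
		st := out[k]
		st.Calls++
		if s.IsError {
			st.Errors++
		}
		st.TotalDur += s.Duration
		out[k] = st
	}
	return out
}

func main() {
	spans := []Span{
		{"api", "GET /ping", 10 * time.Millisecond, false},
		{"api", "GET /ping", 30 * time.Millisecond, true},
		{"api", "POST /user", 5 * time.Millisecond, false},
	}
	for k, st := range Aggregate(spans) {
		fmt.Printf("%s %s calls=%d errors=%d total=%s\n",
			k.Service, k.Operation, st.Calls, st.Errors, st.TotalDur)
	}
}
```

Because the aggregation only sees the spans it is given, any span that is dropped or never sampled is simply missing from the resulting counts.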

CoderPoet commented 1 year ago

@futurist Very good question. Here are some of my considerations.

The community's spanmetrics processor has some problems:

  1. The generated metrics cannot be used to draw a topology: it converts each span into metrics on its own, rather than correlating the resource.service_name of upstream and downstream spans.
  2. The overhead is concentrated in the collector, so performance in high-QPS scenarios is somewhat poor.
  3. The dimension cache mechanism can easily cause span loss, which ultimately leads to inaccurate metrics.
  4. It also limits SDK-side sampling: full sampling is required to calculate the metrics accurately.

Of course, on the other hand, the metrics instrumentation on the SDK side could be made pluggable.
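
One way the pluggable SDK-side instrumentation suggested in this thread might look is sketched below. This is only an assumption about a possible design, not the existing obs-opentelemetry API: `Recorder`, `NoopRecorder`, `CountingRecorder`, `Handler`, and `WithMetrics` are hypothetical names, and `Handler` is a stand-in for a real Hertz handler.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Recorder is a hypothetical extension point: the middleware reports each
// request to it instead of hard-coding which metrics are produced.
type Recorder interface {
	RecordRequest(route string, status int, elapsed time.Duration)
}

// NoopRecorder turns SDK-side metrics off, for example when metrics are
// derived from spans elsewhere.
type NoopRecorder struct{}

func (NoopRecorder) RecordRequest(string, int, time.Duration) {}

// CountingRecorder is a toy in-memory implementation that counts requests
// per route and status code.
type CountingRecorder struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewCountingRecorder() *CountingRecorder {
	return &CountingRecorder{counts: make(map[string]int)}
}

func (r *CountingRecorder) RecordRequest(route string, status int, _ time.Duration) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[fmt.Sprintf("%s %d", route, status)]++
}

// Count returns how many requests were recorded for a route and status.
func (r *CountingRecorder) Count(route string, status int) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[fmt.Sprintf("%s %d", route, status)]
}

// Handler is a simplified stand-in for a framework handler; it returns
// the response status code.
type Handler func() int

// WithMetrics wraps a handler and reports every call to the injected Recorder.
func WithMetrics(route string, rec Recorder, next Handler) Handler {
	return func() int {
		start := time.Now()
		status := next()
		rec.RecordRequest(route, status, time.Since(start))
		return status
	}
}

func main() {
	rec := NewCountingRecorder()
	h := WithMetrics("/ping", rec, func() int { return 200 })
	h()
	h()
	fmt.Println(rec.Count("/ping", 200)) // prints 2
}
```

With this shape, the middleware records every request regardless of the trace sampling decision, and users can swap in `NoopRecorder` if they prefer to generate metrics in the collector.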