grafana / tempo

Grafana Tempo is a high volume, minimal dependency distributed tracing backend.
https://grafana.com/oss/tempo/
GNU Affero General Public License v3.0

Metrics Generator: Include server/server spans as well #3414

Open dkrizic opened 4 months ago

dkrizic commented 4 months ago

Is your feature request related to a problem? Please describe.

I have an ingress-nginx that emits spans of type SERVER and calls a backend whose first span is also of type SERVER. The Metrics Generator does not create a connection between ingress-nginx and the backend. I presume this is because the Metrics Generator only evaluates CLIENT->SERVER spans.

Describe the solution you'd like

A SERVER->SERVER span pair should be handled identically to a CLIENT->SERVER pair.

Describe alternatives you've considered

An alternative would be to change the ingress-nginx OTEL integration to emit two spans, first a SERVER span and a child span of type CLIENT.
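To make the difference concrete, here is a minimal Go sketch of the two trace shapes. It is not Tempo's code: the `Span` type, `ClientServerEdges`, and `IngressTrace` are hypothetical names, and the pairing rule (a SERVER span whose direct parent is a CLIENT span in another service) is a simplified assumption based on the behavior described in this issue.

```go
package main

import "fmt"

// Kind mirrors the two OpenTelemetry span kinds relevant to this issue.
type Kind string

const (
	Client Kind = "CLIENT"
	Server Kind = "SERVER"
)

// Span is a hypothetical, simplified span record (not Tempo's type).
type Span struct {
	ID, ParentID string
	Service      string
	Kind         Kind
}

// ClientServerEdges returns "caller -> callee" for every SERVER span whose
// direct parent is a CLIENT span belonging to a different service.
func ClientServerEdges(spans []Span) []string {
	byID := make(map[string]Span, len(spans))
	for _, s := range spans {
		byID[s.ID] = s
	}
	var edges []string
	for _, s := range spans {
		parent, ok := byID[s.ParentID]
		if !ok || s.Kind != Server || parent.Kind != Client {
			continue // SERVER->SERVER pairs are skipped here
		}
		if parent.Service != s.Service {
			edges = append(edges, fmt.Sprintf("%s -> %s", parent.Service, s.Service))
		}
	}
	return edges
}

// IngressTrace builds the trace from this issue. With withClient set, it adds
// the CLIENT child span proposed as an alternative.
func IngressTrace(withClient bool) []Span {
	spans := []Span{{ID: "1", Service: "ingress-nginx", Kind: Server}}
	parent := "1"
	if withClient {
		spans = append(spans, Span{ID: "2", ParentID: "1", Service: "ingress-nginx", Kind: Client})
		parent = "2"
	}
	return append(spans, Span{ID: "3", ParentID: parent, Service: "backend", Kind: Server})
}
```

Under this rule, `ClientServerEdges(IngressTrace(false))` finds no edge, while `ClientServerEdges(IngressTrace(true))` finds `ingress-nginx -> backend`.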

joe-elliott commented 3 months ago

Agree this would be a nice improvement to the generator. It might be difficult to pull off because the current code uses the client/server labels to track edges:

https://github.com/grafana/tempo/blob/main/modules/generator/processor/servicegraphs/servicegraphs.go#L174

One option would be to track all spans regardless of span kind and note any cross-service calls, but this would obviously be more costly. It's also tough b/c if a server->server edge is found, we still need to keep the edge around in case the matching client->server shows up. There are some complexities, but this does happen in the real world and it would be nice to handle it.

Does client->client ever occur?
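The difficulty can be illustrated with a small sketch of kind-based edge tracking. This is a simplified illustration, not the code in `servicegraphs.go`: the `SpanRecord`, `Edge`, and `Store` names are hypothetical, and the keying scheme (trace ID plus the client span's ID, which a SERVER span looks up through its parent ID) is an assumption about how such a store could work.

```go
package main

import "fmt"

// SpanRecord is a hypothetical, simplified span (not Tempo's type).
type SpanRecord struct {
	TraceID, ID, ParentID string
	Service, Kind         string
}

// Edge holds the two sides of a service graph edge; a side stays empty
// until the matching span has been observed.
type Edge struct{ Client, Server string }

func (e Edge) String() string { return fmt.Sprintf("%s -> %s", e.Client, e.Server) }

// Store keeps half-built edges until the other side arrives.
type Store struct{ pending map[string]*Edge }

func NewStore() *Store { return &Store{pending: map[string]*Edge{}} }

// Pending reports how many half-built edges are still waiting.
func (s *Store) Pending() int { return len(s.pending) }

// Observe records one span and returns the edge if this span completes it.
// Spans may arrive in any order.
func (s *Store) Observe(sp SpanRecord) (Edge, bool) {
	var key string
	switch sp.Kind {
	case "CLIENT":
		key = sp.TraceID + "/" + sp.ID // a client is found via its own ID
	case "SERVER":
		key = sp.TraceID + "/" + sp.ParentID // a server looks up its parent
	default:
		return Edge{}, false
	}
	e, ok := s.pending[key]
	if !ok {
		e = &Edge{}
		s.pending[key] = e
	}
	if sp.Kind == "CLIENT" {
		e.Client = sp.Service
	} else {
		e.Server = sp.Service
	}
	if e.Client == "" || e.Server == "" {
		return Edge{}, false // still waiting for the other side
	}
	delete(s.pending, key)
	return *e, true
}
```

In this sketch a SERVER span only ever fills the server side of an edge, so a SERVER parent never registers as a caller. Feeding it the ingress-nginx trace leaves two half-built edges pending and completes none, and a real store would presumably also need to expire such entries.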

msvechla commented 2 months ago

Hi, I stumbled across this issue. Is it true that the Tempo metrics generator only considers CLIENT->SERVER spans? Does it also consider SERVER->CLIENT spans?

I ask because I am currently comparing the Tempo metrics generator with the OTel metrics generator, and it looks like SERVER->CLIENT span metrics are missing.

dkrizic commented 2 months ago

Hi @msvechla, I did not test it, but according to their documentation, they only consider CLIENT->SERVER and PRODUCER->CONSUMER spans. Others do the same, e.g. Datadog or Azure Monitor. I am also talking to the ingress-nginx guys to create a second span of type CLIENT in order to complete my diagram. https://github.com/kubernetes/ingress-nginx/issues/11002#