Closed finda-yeongjo closed 2 months ago
Hi! Service graphs have a number of ways of identifying communication between services; for Tempo they're described in the docs. Connections do not necessarily need to represent HTTP requests.
* A request across a messaging system where the outgoing and the incoming span must have a `span.kind` of `producer` and `consumer`, respectively.
This is what's identifying a connection between the two services.
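The pairing rule above can be sketched roughly as follows. This is only an illustration of the idea, not Tempo's actual implementation: the function name `classify_edge` and the `"direct"` label are hypothetical, and only `messaging_system` is a value taken from this thread.

```python
from typing import Optional

# Illustrative mapping only. "messaging_system" is the connection type named
# in this thread; "direct" is a hypothetical label for client/server pairs.
PAIRINGS = {
    ("client", "server"): "direct",
    ("producer", "consumer"): "messaging_system",
}


def classify_edge(outgoing_kind: str, incoming_kind: str) -> Optional[str]:
    """Return a connection type for an outgoing/incoming span pair.

    Returns None when the two span kinds do not form a recognised pair,
    in which case no edge would be drawn between the services.
    """
    key = (outgoing_kind.lower(), incoming_kind.lower())
    return PAIRINGS.get(key)


if __name__ == "__main__":
    print(classify_edge("producer", "consumer"))
    print(classify_edge("client", "server"))
    print(classify_edge("internal", "server"))
```

Under this reading, two services that only exchange messages through a Kafka topic still produce an edge, because the producer span and the consumer span share a trace and form a matching pair.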
Hey @mapno, your answer was fantastic. I have completely removed the problematic parts from the dashboard and the various graphs that use Tempo as a data source. I blame myself for not reading the docs carefully.
May I ask one more question?
When specifying `span_kind`, there is no data (`span_kind_consumer`, `producer`, `server`, `client` and `unspecified`). Is there any additional configuration needed? Simply setting `connection_type=messaging_system` shows all servers communicating through MSK.
I am using auto-instrumentation because I cannot enforce spans on all technical teams, which makes it difficult for me to directly control headers, span kinds, IDs, etc.
Hey! `span_kind` is not a label of service graph metrics (it's set on span-metrics though). I'm not sure if it'd make sense to add it in the first place, since it's implicit in the connection type, i.e. if `connection_type` is `messaging_system`, the spans must have had kind `consumer` and `producer`.
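Since `connection_type` is the label available on service graph metrics, a dashboard query can filter on it instead of `span_kind`. Below is a minimal sketch that builds such a PromQL expression. The helper name `service_graph_query` and the `5m` window are assumptions made for illustration, and the `client` and `server` grouping labels are assumed to be present on the metric in your setup.

```python
def service_graph_query(include_messaging: bool, window: str = "5m") -> str:
    """Build a PromQL expression over the service graph request counter.

    include_messaging=True keeps only messaging-system edges;
    include_messaging=False excludes them (series without the label
    are kept, because an unset label does not equal "messaging_system").
    """
    op = "=" if include_messaging else "!="
    selector = (
        "traces_service_graph_request_total"
        f'{{connection_type{op}"messaging_system"}}'
    )
    # Grouping by client and server is an assumption about the label set.
    return f"sum by (client, server) (rate({selector}[{window}]))"


if __name__ == "__main__":
    print(service_graph_query(include_messaging=False))
```

Using the negated matcher is one way to hide Kafka-mediated edges from a panel while leaving the remaining edges in place.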
Describe the bug When monitoring Java applications using OpenTelemetry Java Auto-Instrumentation, the trace data incorrectly shows service A calling service B (A -> B), even though there is no actual call between A and B. In keeping with the microservices concept, A and B produce and consume data through a Kafka topic, maintaining "loose coupling" between each other. This issue is evident in the `traces_service_graph_request_total` metric and the Zipkin trace data, which suggest a relationship that does not exist. If I enable the following two options in the OpenTelemetry instrumentation, Service A changes to "user," but the data is still identified.
I initially raised this issue with the OpenTelemetry team, but their response suggested raising the issue with Grafana and Tempo instead. Ref. https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/11348
To Reproduce Steps to reproduce the behavior:
Expected behavior The trace data and metrics should accurately reflect the interactions between services. Specifically, no traces or metrics should suggest a direct interaction between service A and service B when there is none.
Environment:
Additional Context All services are deployed as pods in EKS. The issue persists even after verifying that there are no overlapping or contaminated headers and that Trace IDs are unique and correctly configured. The environment configuration for OpenTelemetry instrumentation includes settings for exporting to Prometheus and Zipkin, capturing content-type headers for HTTP requests and responses.
Service A JDK: Amazon Corretto 17 Spring: 2.7.1 OS: Amazon Linux (EKS)
Service B JDK: Amazon Corretto 17 Spring: 3.0.5 OS: Amazon Linux (EKS)