elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana

Open Telemetry span links are wrong direction #183973

Open MeijerM1 opened 5 months ago

MeijerM1 commented 5 months ago

APM Server version (apm-server version): 8.12.1

Description of the problem including expected versus actual behavior: When span links are created via the OTLP protocol, they are shown in the wrong direction. That is, the producing side has incoming links while the consuming side has outgoing links. For us it would make more sense if the producing side had outgoing links and the consuming side had incoming links.

Steps to reproduce:

  1. Create two applications (in my case .NET), a producer and a consumer.
  2. Connect the two applications via any service bus implementation.
  3. Instrument both applications using OpenTelemetry and export the data via the OTLP protocol directly to the APM server.
  4. Make sure the event contains the traceparent header.
  5. In the consumer application, create a span link based on the traceparent header.
  6. Find the corresponding transactions in Kibana and observe that the producer application has an incoming link while the consumer has an outgoing link.
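
Steps 4 and 5 can be sketched as follows. This is a minimal, library-free Python illustration (the reporter's applications are .NET), assuming the header follows the W3C Trace Context layout `version-traceid-parentid-flags`. The names `parse_traceparent` and `LinkTarget` are hypothetical, and the header value is assembled from the IDs of the producer span shared further down in this thread, with the version and flags fields assumed.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context layout: version "-" trace-id "-" parent-id "-" trace-flags
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class LinkTarget(NamedTuple):
    """The span a link points at (hypothetical helper type)."""
    trace_id: str
    span_id: str


def parse_traceparent(header: str) -> Optional[LinkTarget]:
    """Extract the trace ID and span ID a consumer would link to."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, _flags = match.groups()
    # Version "ff" and all-zero IDs are invalid in the W3C format.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return LinkTarget(trace_id, span_id)


# Assumed header value: version "00" and flags "01" are illustrative.
header = "00-198345c15b0a406b5ecc549218a41d2e-ae64e75fd492038f-01"
link = parse_traceparent(header)
print(link)
```

The consumer would then attach `link.trace_id` and `link.span_id` as a span link on its own span. Note that the link is recorded on the consumer only; nothing in this step marks it as incoming or outgoing.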

Provide logs (if relevant): N/A

axw commented 5 months ago

@MeijerM1 this doesn't sound like an APM Server problem. APM Server processes each event independently. If links are being shown in the wrong place, then it's more likely a problem with either the instrumentation or visualisation.

Would you be able to share the documents for one trace? And a screenshot of Kibana to illustrate where things look wrong?
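
One way to collect the documents for a single trace is a term query on `trace.id`. The sketch below only builds the search body; the function name `trace_documents_query` and the placeholder trace ID are hypothetical, and the body is assumed to be sent to the APM traces data stream (for example an index pattern such as `traces-apm*`).

```python
import json


def trace_documents_query(trace_id: str, size: int = 100) -> dict:
    """Build an Elasticsearch search body for all documents of one trace."""
    return {
        "size": size,
        # trace.id is matched exactly, so a term query is sufficient.
        "query": {"term": {"trace.id": trace_id}},
        "sort": [{"@timestamp": "asc"}],
    }


# Placeholder value; substitute the trace ID under investigation.
body = trace_documents_query("TRACE_ID_HERE")
print(json.dumps(body, indent=2))
```

Because a span link can connect two different traces, the query may need to be run once per trace ID to capture both the linking and the linked document.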

MeijerM1 commented 5 months ago

@axw I understand what you mean, but the OpenTelemetry spec doesn’t have an option to specify link direction like the APM spec has, so there is no way to fix it on the instrumentation side.

I’ll see if I can get some data for you later today.

axw commented 5 months ago

Hmm, Elastic APM agents don't specify a direction on links either: https://github.com/elastic/apm/blob/main/specs/agents/span-links.md. Maybe this is a UI thing based on some heuristics - not sure. Anyway, we'll see.

MeijerM1 commented 5 months ago

Some data as requested

This is the producing span:

{
  "@timestamp": [
    "2024-05-21T11:11:51.333Z"
  ],
  "agent.name": [
    "opentelemetry/dotnet"
  ],
  "agent.version": [
    "1.8.0"
  ],
  "data_stream.dataset": [
    "apm"
  ],
  "data_stream.namespace": [
    "default"
  ],
  "data_stream.type": [
    "traces"
  ],
  "event.agent_id_status": [
    "missing"
  ],
  "event.ingested": [
    "2024-05-21T11:11:54.000Z"
  ],
  "event.outcome": [
    "unknown"
  ],
  "labels.team": [
    "Neo"
  ],
  "observer.hostname": [
    "7ea80461f4bf"
  ],
  "observer.type": [
    "apm-server"
  ],
  "observer.version": [
    "8.12.1"
  ],
  "parent.id": [
    "0732d162d658b50b"
  ],
  "processor.event": [
    "span"
  ],
  "service.language.name": [
    "dotnet"
  ],
  "service.name": [
    "MyService"
  ],
  "service.node.name": [
    "AW0SDWK00000F"
  ],
  "service.version": [
    "1.0.0.0"
  ],
  "span.duration.us": [
    934
  ],
  "span.id": [
    "ae64e75fd492038f"
  ],
  "span.name": [
    "publish MyEvent"
  ],
  "span.representative_count": [
    1
  ],
  "span.type": [
    "unknown"
  ],
  "timestamp.us": [
    1716289911333246
  ],
  "trace.id": [
    "198345c15b0a406b5ecc549218a41d2e"
  ],
  "_id": "Gg7Zmo8BX5usOMpiyCaA",
  "_index": ".ds-traces-apm-default-2024.05.21-000328",
  "_score": null
}

The consumer transaction:

{
  "@timestamp": [
    "2024-05-21T11:11:51.424Z"
  ],
  "agent.name": [
    "opentelemetry/dotnet"
  ],
  "agent.version": [
    "1.8.0"
  ],
  "data_stream.dataset": [
    "apm"
  ],
  "data_stream.namespace": [
    "default"
  ],
  "data_stream.type": [
    "traces"
  ],
  "event.agent_id_status": [
    "missing"
  ],
  "event.ingested": [
    "2024-05-21T11:11:52.000Z"
  ],
  "event.outcome": [
    "success"
  ],
  "event.success_count": [
    1
  ],
  "labels.invocationId": [
    "2d91ae09-cc18-4a06-83c9-443a423de158"
  ],
  "labels.team": [
    "Neo"
  ],
  "observer.hostname": [
    "7ea80461f4bf"
  ],
  "observer.type": [
    "apm-server"
  ],
  "observer.version": [
    "8.12.1"
  ],
  "processor.event": [
    "transaction"
  ],
  "service.framework.name": [
    "framework"
  ],
  "service.language.name": [
    "dotnet"
  ],
  "service.name": [
    "MyService"
  ],
  "service.node.name": [
    "AW0SDWK000003"
  ],
  "service.version": [
    "1.0.0.0"
  ],
  "span.id": [
    "d62f731166f3e73e"
  ],
  "span.links.span.id": [
    "ae64e75fd492038f"
  ],
  "span.links.trace.id": [
    "198345c15b0a406b5ecc549218a41d2e"
  ],
  "timestamp.us": [
    1716289911424047
  ],
  "trace.id": [
    "146e102a126906cd3111b1aa1b03b954"
  ],
  "transaction.duration.us": [
    214700
  ],
  "transaction.id": [
    "d62f731166f3e73e"
  ],
  "transaction.name": [
    "ProcessXX"
  ],
  "transaction.name.text": [
    "ProcessXX"
  ],
  "transaction.representative_count": [
    1
  ],
  "transaction.result": [
    "Success"
  ],
  "transaction.sampled": [
    true
  ],
  "transaction.type": [
    "unknown"
  ],
  "_id": "N6_Zmo8B2aDBwvL8v_n5",
  "_index": ".ds-traces-apm-default-2024.05.21-000328",
  "_score": null
}

Screenshot from the producing side. Notice the incoming link, which I expected to be outgoing.
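
In the two documents above, only the consumer transaction carries `span.links.*` fields; the producer span has none. The sketch below shows one heuristic that would reproduce the reported display. It is an assumption for illustration, not confirmed Kibana logic: a link stored on a document counts as outgoing for that document, and as incoming for the document it points at. The function `classify_links` is hypothetical, and the sample documents are trimmed to the relevant fields.

```python
from typing import Dict, List


def classify_links(docs: List[dict]) -> Dict[str, Dict[str, List[str]]]:
    """Group links per span ID under the assumed 'stored = outgoing' rule."""
    result = {
        doc["span.id"][0]: {"outgoing": [], "incoming": []} for doc in docs
    }
    for doc in docs:
        source = doc["span.id"][0]
        # Fields are arrays, as in the flattened documents shared above.
        for target in doc.get("span.links.span.id", []):
            result[source]["outgoing"].append(target)
            if target in result:
                result[target]["incoming"].append(source)
    return result


producer = {
    "span.id": ["ae64e75fd492038f"],
    "span.name": ["publish MyEvent"],
}
consumer = {
    "span.id": ["d62f731166f3e73e"],
    "transaction.name": ["ProcessXX"],
    "span.links.span.id": ["ae64e75fd492038f"],
}

links = classify_links([producer, consumer])
for span_id, directions in links.items():
    print(span_id, directions)
```

Under this rule the producer span ends up with one incoming link and the consumer with one outgoing link, which matches the behavior described in this issue. Showing the opposite labels would require inverting the rule or using extra information, such as the span kind, which these documents do not expose.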

axw commented 5 months ago

Thanks @MeijerM1, this is indeed a UI issue. I don't know the details of how the link direction is inferred - moving it over to the Kibana repo to get the UI folks to take a look.

elasticmachine commented 5 months ago

Pinging @elastic/apm-ui (Team:APM)

elasticmachine commented 5 months ago

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)