jaegertracing / jaeger-ui

Web UI for Jaeger
http://jaegertracing.io/
Apache License 2.0
1.14k stars 482 forks source link

[Bug]: flamegraph is confused by some traces #1012

Open bobrik opened 1 year ago

bobrik commented 1 year ago

What happened?

Ad a Jaeger user, I want to see representative flamegraph representation of a trace. Unfortunately, it is misleading.

Steps to reproduce

  1. Load the following json:
bug.json ```json { "data": [ { "traceID": "0014a21066967d7c", "spans": [ { "traceID": "0014a21066967d7c", "spanID": "ad55f46b6e407452", "flags": 3, "operationName": "request_logic", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "d523c4890747823a" } ], "startTime": 1666977689604828, "duration": 43187, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "c27b2c5411baf3b6", "flags": 3, "operationName": "rewrite", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "56cd2ecf981d0918" } ], "startTime": 1666977689712349, "duration": 38771, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "d523c4890747823a", "flags": 3, "operationName": "request", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "05bd7c4d934a00f0" } ], "startTime": 1666977689604569, "duration": 161560, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "c3cd43544ecbc87c", "flags": 3, "operationName": "fetch", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "7d521297961cdc67" } ], "startTime": 1666977689710544, "duration": 53289, "tags": [], "logs": [], "processID": "p4", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "38ce0caa9e4b3d22", "flags": 3, "operationName": "upstream_ttfb", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "206bd72c0bbf060d" } ], "startTime": 1666977689751315, "duration": 9377, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "7f70e28c238aa159", "flags": 3, "operationName": "upstream_ttfb", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "1ccad7dc6c81c815" } ], "startTime": 1666977689650676, "duration": 112750, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "021209054d6c3f1b", "flags": 3, "operationName": "request", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "206bd72c0bbf060d" } ], "startTime": 1666977689752371, "duration": 2968, "tags": [], "logs": [], "processID": "p3", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "7d521297961cdc67", "flags": 3, "operationName": "request_context", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "c9179afc072b179d" } ], "startTime": 1666977689652183, "duration": 111950, "tags": [], "logs": [], "processID": "p4", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "0a1a3ac8a681da9c", "flags": 3, "operationName": "http_request", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "1ccad7dc6c81c815" } ], "startTime": 1666977689650841, "duration": 112685, "tags": [], "logs": [], "processID": "p4", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "abfa22088cbfe8dc", "flags": 3, "operationName": "request_logic", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "c27b2c5411baf3b6" } ], "startTime": 1666977689712395, "duration": 38639, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "206bd72c0bbf060d", "flags": 3, "operationName": "upstream", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "56cd2ecf981d0918" } ], "startTime": 1666977689751295, "duration": 10840, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "56cd2ecf981d0918", "flags": 3, "operationName": "request", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "c3cd43544ecbc87c" } ], "startTime": 1666977689711999, "duration": 53300, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "ed8c4a2564960938", "flags": 3, "operationName": "request", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "0014a21066967d7c" }, { "refType": "FOLLOWS_FROM", "traceID": "b9906ed48bad4c78", "spanID": "b9906ed48bad4c78" } ], "startTime": 1666977689603563, "duration": 162488, "tags": [], "logs": [], "processID": "p2", "warnings": [ "invalid parent span IDs=0014a21066967d7c; skipping clock skew adjustment" ] }, { "traceID": "0014a21066967d7c", "spanID": "1ccad7dc6c81c815", "flags": 3, "operationName": "upstream", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "d523c4890747823a" } ], "startTime": 1666977689650664, "duration": 113907, "tags": [], "logs": [], "processID": "p1", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "05bd7c4d934a00f0", "flags": 3, "operationName": "upstream_ttfb", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "ed8c4a2564960938" } ], "startTime": 1666977689604361, "duration": 159975, "tags": [], "logs": [], "processID": "p2", "warnings": null }, { "traceID": "0014a21066967d7c", "spanID": "c9179afc072b179d", "flags": 3, "operationName": "code", "references": [ { "refType": "CHILD_OF", "traceID": "0014a21066967d7c", "spanID": "0a1a3ac8a681da9c" } ], "startTime": 1666977689651704, "duration": 112592, "tags": [], "logs": [], "processID": "p4", "warnings": [ "clock skew adjustment disabled; not applying calculated delta of -816.5µs" ] } ], "processes": { "p1": { "serviceName": "logic-proxy", "tags": [] }, "p2": { "serviceName": "eyeball-facing-service", "tags": [] }, "p3": { "serviceName": "cache", "tags": [] }, "p4": { "serviceName": "customer-code", "tags": [] } }, "warnings": null } ], "total": 0, "limit": 0, "offset": 0, "errors": null } ```
  1. Open it in Jaeger and see the following perfectly normal trace timeline:
image
  1. Switch to flamegraph view and see the following:
image

You can see that request_logic, which appears twice in the trace, is only shown once in the flamegraph.

The customer-code service that takes up a sizable chunk of time is also barely represented off to the side.

Expected behavior

Flamegraph shows all spans, they are organized in a logical way.

Relevant log output

No response

Screenshot

No response

Additional context

No response

Jaeger backend version

v1.39.0

SDK

No response

Pipeline

No response

Stogage backend

No response

Operating system

No response

Deployment model

No response

Deployment configs

No response

bobrik commented 1 year ago

I also made https://github.com/pyroscope-io/pyroscope/issues/1671, since we're mostly using what pyroscope provides.

Rperry2174 commented 1 year ago

Thanks for reporting and providing info to reproduce @bobrik Pyroscope team will take a look into this and fix