Open sergiimk opened 1 year ago
I think I am able to reproduce what you're seeing, though what I see is a big stack of do_stuff spans, as if it were a weird recursive call.
Getting the id is working as intended, as I chose to fetch the root id (as seen here). The question is, is there a better/more intuitive behavior?
If I set "id" to the root id, you can see the parent-child relationship between async spans with them stacked on each other. In this case, each independent task needs to be a root span to display correctly.
The only other thing I can think to supply for "id" is the unique id for the span. This would show every span as an independent task, and lose the parent-child stacking relationships.
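To make the two choices for "id" concrete, here is a minimal std-only sketch. It is not the tracing-chrome implementation: `SpanTree`, `add_child`, `root_id` and `unique_id` are hypothetical names, and the parent map merely stands in for whatever span registry the layer consults.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a span registry: maps a span id to its parent id.
/// Root spans have no entry in the map.
struct SpanTree {
    parent: HashMap<u64, u64>,
}

impl SpanTree {
    fn new() -> Self {
        SpanTree { parent: HashMap::new() }
    }

    /// Records that span `child` was created inside span `parent`.
    fn add_child(&mut self, parent: u64, child: u64) {
        self.parent.insert(child, parent);
    }

    /// Option 1: walk up the parent links and use the root span's id.
    /// Every span under the same root ends up with the same "id".
    fn root_id(&self, mut span: u64) -> u64 {
        while let Some(&p) = self.parent.get(&span) {
            span = p;
        }
        span
    }

    /// Option 2: use the span's own id.
    /// Every span gets a distinct "id", so nothing is grouped or stacked.
    fn unique_id(&self, span: u64) -> u64 {
        span
    }
}
```

With a root span 1 and two concurrent children 2 and 3, `root_id` returns 1 for both children, which matches the `"id": 1` seen in the trace, while `unique_id` keeps them apart at the cost of the stacking.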
I'm open to more ideas but at the moment I don't see a clearly better behavior. I'll give it some thought when I can, though.
Thanks for your reply @thoren-d.
Firstly I made this tiny app to reproduce the issue:
Which generates a trace like this:
Now regarding how to fix this:
I totally agree with you that the flame-graph-like stacked view is really nice, and it would be unfortunate to lose it by splitting all coroutines into their own tid
... although this is basically how tools like Jaeger visualize things:
What helps in Jaeger's case a bit is:
I also want to point out that although the parallel case in my example (tokio::spawn with root spans per task) looks good, when you do fork-join processing you will most likely want to tie the spawned tasks to a common root span (which produces an invalid graph, as in the parallel_linked case).
In our app we also frequently do fork-join without tokio::spawn at all (the concurrent case), as we don't need/want extra threads.
I'll need to think a bit more about this, but so far I see only two options:
ChromeLayer
tid
Perhaps Perfetto is just not the right tool for displaying flows of execution, and is only suited for profiling what CPUs are actually doing ... but I really wish to be able to do both in one tool.
They do seem to have a higher level "Flows" feature:
After some experimentation I ended up writing a custom layer that:
- emits duration events (ph: B/E)
- emits async events (ph: b/e) with unique ID

The first one gives me a (stacked) flame graph of all coroutines that were actively running on a CPU per thread.
The second one gives a view of async jobs, with one track per timeline (similarly to Jaeger).
It's just like running with TraceStyle::Threaded and TraceStyle::Async at the same time, with the latter un-stacked.
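The dual output described above can be sketched roughly as follows. This is not the actual custom layer, only an illustration of the two event shapes in the Chrome trace JSON format: `SpanEvent`, `thread_event` and `async_event` are hypothetical names, the `"cat"` value and `"pid": 1` are arbitrary, and span names are assumed to contain no characters that need JSON escaping.

```rust
/// Hypothetical snapshot of a span at the moment it is entered or exited.
struct SpanEvent<'a> {
    name: &'a str,
    span_id: u64,   // unique per span, used as the async "id"
    thread_id: u64, // thread the span is currently running on
    ts_us: u64,     // timestamp in microseconds
}

/// Duration event (ph "B"/"E") for the per-thread flame graph.
/// Events are grouped by "tid", so spans running on one thread stack.
fn thread_event(ev: &SpanEvent, begin: bool) -> String {
    let ph = if begin { "B" } else { "E" };
    format!(
        r#"{{"name":"{}","ph":"{}","pid":1,"tid":{},"ts":{}}}"#,
        ev.name, ph, ev.thread_id, ev.ts_us
    )
}

/// Async event (ph "b"/"e") for the per-task view.
/// Using the span's own id keeps every task on a separate, un-stacked track.
fn async_event(ev: &SpanEvent, begin: bool) -> String {
    let ph = if begin { "b" } else { "e" };
    format!(
        r#"{{"name":"{}","cat":"async","ph":"{}","pid":1,"tid":{},"id":{},"ts":{}}}"#,
        ev.name, ph, ev.thread_id, ev.span_id, ev.ts_us
    )
}
```

A layer built along these lines would call both functions for each span, so one trace file carries the thread view and the task view side by side.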
Here's what the result looks like with the test app I shared above:
It's not perfect, as the number of timelines can get quite large, but it suits my app.
Perfetto devs are also considering adding nested hierarchical grouping to timelines, which would allow fully replicating Jaeger's layout.
Also there is this issue that points out that old Chrome tracing actually supported non-stackable concurrent events, so that might be another option for displaying concurrent tasks within one timeline.
I have a tokio-based CLI program that creates a root span upon start and then processes multiple files concurrently using code like:
I configure the layer with TraceStyle::Async, but even then the resulting graph in Perfetto looks as if all the do_stuff spans are mushed together into one bar instead of being treated as independent concurrent tasks.

Looking at the trace file, all do_stuff events seem to have "id": 1, so Perfetto is likely unable to tell them apart.

Is this expected / a current limitation / or perhaps I'm doing something wrong?