Open davidbarsky opened 4 years ago
I'm strongly in favour of more options for visualization. I think that broadly, we have two categories of options:
tracing
outputs.tracing
's.
Potentially, some fidelity is lost when the other system can't represent something that `tracing` emits, or when `tracing` doesn't record something the other system expects. Also, there are sometimes semantic differences: if we implement an integration with a system that's based on callstack sampling, it would make sense to represent spans to the other system as "stack frames". However, a `tracing` span is not exactly a stack frame: it might span multiple stack frames, be entered multiple times in different callstacks, et cetera, and represents a higher-level notion of context in the actual application logic, rather than being a detail of what actual functions are called.
I think we definitely want to provide integrations with as many other existing tools as possible. There are already several: `tracing-coz`, `tracing-tracy`, and `tracing-flame`, as well as the OpenTelemetry integration, all come to mind, as well as `tracing-wasm`'s support for browser perf analysis tools. However, one thing about these other tools is that they tend to be tuned for a particular use case. For example, `tracy` is intended for gamedev, and it has a first-class concept of "frames" (as in video, not as in stack frames), so it may not be suitable for debugging a microservice. In contrast, OpenTelemetry is a distributed tracing tool that makes request-response RPC models first class... which you probably don't want if you want to debug a game. And obviously, the browser performance profiling tools only make sense when you're running in the browser.
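To make the span-versus-stack-frame point concrete, here is a rough sketch of flattening span enter/exit records into the `outer;inner value` folded-stack lines that flamegraph tools consume. The `SpanEvent` type and `fold` function are hypothetical names for illustration only; they are not part of `tracing` or of any existing integration.

```rust
use std::collections::BTreeMap;

/// A hypothetical enter/exit record for a span (not a `tracing` type).
/// Timestamps are in microseconds since the start of the capture.
pub enum SpanEvent<'a> {
    Enter { name: &'a str, at_us: u64 },
    Exit { at_us: u64 },
}

/// Folds a well-nested sequence of enter/exit records into a map from
/// `outer;inner` span paths to the time spent with exactly that path active.
pub fn fold(events: &[SpanEvent<'_>]) -> BTreeMap<String, u64> {
    let mut stack: Vec<&str> = Vec::new();
    let mut last = 0u64;
    let mut totals: BTreeMap<String, u64> = BTreeMap::new();

    for ev in events {
        let now = match ev {
            SpanEvent::Enter { at_us, .. } | SpanEvent::Exit { at_us } => *at_us,
        };
        // Charge the elapsed time to whatever span path was active until now.
        if !stack.is_empty() {
            *totals.entry(stack.join(";")).or_insert(0) += now.saturating_sub(last);
        }
        last = now;
        match ev {
            SpanEvent::Enter { name, .. } => stack.push(*name),
            SpanEvent::Exit { .. } => {
                stack.pop();
            }
        }
    }
    totals
}
```

If a `db` span is entered twice inside a `request` span, both entries are summed under the single key `request;db`. The output no longer says the span was entered twice, which is the kind of fidelity loss described above.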
So, while integrations with other tools are very valuable, they don't really give us a solution that we can suggest to anyone who wants to be able to visualize trace data. For example, OpenTelemetry is a great option if you are implementing a microservice in a distributed application that already has all the infrastructure to use this data set up (if you're already running a Jaeger collector or something), but it's not something it would make sense to suggest to the `rustc` maintainers. This suggests that we might want to think about doing our own thing that's specific to `tracing`, in addition to supporting people who are implementing integrations.
I think that perhaps the best use case to target with such a tool is interactive debugging. There are already several integrations that are more intended for performance profiling use cases, like `tracing-flame`; these tend to generate a static representation that records a single run of the software (e.g. a flamegraph SVG). We might want to think about a tool for interactive, on-line debugging of a running system, and/or for interactively exploring a saved capture from a running system. In particular, I might want to prioritize a syntax for interactively filtering/querying the data, and controlling how it is formatted.
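As a rough idea of what a filtering/querying syntax could look like, here is a minimal sketch. Everything in it is an assumption made up for illustration: the `level>=`, `span=` and `msg~` clauses, the `Record` and `Clause` types, and the ordering of `Level` are not an existing `tracing` API or an agreed design.

```rust
/// Severity levels, ordered from least to most severe (an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
}

/// A captured event (hypothetical record type).
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub level: Level,
    /// Names of the enclosing spans, outermost first.
    pub spans: Vec<String>,
    pub message: String,
}

/// One clause of a query; a record must satisfy every clause to match.
#[derive(Debug, Clone, PartialEq)]
pub enum Clause {
    MinLevel(Level),
    InSpan(String),
    Contains(String),
}

fn parse_level(s: &str) -> Result<Level, String> {
    match s.to_ascii_lowercase().as_str() {
        "trace" => Ok(Level::Trace),
        "debug" => Ok(Level::Debug),
        "info" => Ok(Level::Info),
        "warn" => Ok(Level::Warn),
        "error" => Ok(Level::Error),
        other => Err(format!("unknown level: {}", other)),
    }
}

/// Parses a whitespace-separated query such as `level>=warn span=relay msg~timeout`.
pub fn parse_query(input: &str) -> Result<Vec<Clause>, String> {
    input
        .split_whitespace()
        .map(|tok| {
            if let Some(v) = tok.strip_prefix("level>=") {
                parse_level(v).map(Clause::MinLevel)
            } else if let Some(v) = tok.strip_prefix("span=") {
                Ok(Clause::InSpan(v.to_string()))
            } else if let Some(v) = tok.strip_prefix("msg~") {
                Ok(Clause::Contains(v.to_string()))
            } else {
                Err(format!("unrecognized clause: {}", tok))
            }
        })
        .collect()
}

/// Returns true if the record satisfies every clause of the query.
pub fn matches(record: &Record, query: &[Clause]) -> bool {
    query.iter().all(|clause| match clause {
        Clause::MinLevel(min) => record.level >= *min,
        Clause::InSpan(name) => record.spans.iter().any(|s| s == name),
        Clause::Contains(needle) => record.message.contains(needle.as_str()),
    })
}
```

An interactive tool could re-run `matches` over a saved capture each time the user edits the query; a real design would presumably also want structured fields and some control over formatting.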
In re: your specific suggestions:
- Use pprof. This might be kinda easy!
I think pprof is quite nice, and we should definitely have a layer for outputting the pprof format. However, it's very strongly geared towards profiling in particular, and its data model seems to emphasize performance data. I'm not sure if pprof output would be the ideal solution for an interactive debugging tool.
I have started working on a simple web GUI that can separate out the logs from different async components and show them side by side. This is what I have so far. It's far from finished and still needs a lot of polish and features.
What you see is a log file from a program that has three components: a server and a client that send messages back and forth through a relay. It uses the tracing instrument method to give instrumented executors to each of these tasks so their log statements are annotated.
The first image shows the filter boxes with the names of the instrumented executors filled in. The second image shows a scroll down, where you can see how the program flow goes through the three components.
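The column idea described above can be sketched roughly as follows. This is not the actual code of the web GUI; `split_into_columns` is a hypothetical helper, and it assumes each column is defined by a plain-text filter (such as an executor name) matched against the log line.

```rust
/// Places each log line in every column whose filter text it contains.
///
/// Row `i` of the result corresponds to input line `i`, so reading down the
/// rows shows how the program flow moves between components. A cell is `None`
/// when the line does not match that column's filter.
pub fn split_into_columns<'a>(
    lines: &[&'a str],
    filters: &[&str],
) -> Vec<Vec<Option<&'a str>>> {
    lines
        .iter()
        .map(|line| {
            filters
                .iter()
                .map(|f| if line.contains(*f) { Some(*line) } else { None })
                .collect()
        })
        .collect()
}
```

With filters `server`, `relay` and `client`, a line mentioning both the relay and the client lands in both of those columns, and a line matching none of the filters yields a row of empty cells.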
Some next steps I am planning:
This is still a work in progress, but I'm traveling. Next week I will polish some more. What's missing right now:
It does have collapsible columns, detects log levels and colors them as well as letting you filter on them.
So I quickly published it on GitHub Pages so others can have a look at it: https://najamelan.github.io/tracing_prism/ You can find the code for now on my GitHub profile.
It should be great if you have a wide screen, but I'm on a laptop right now so I haven't tested that myself yet. Give it a try. If you want to run it offline, you can download the repository and check out the gh-pages branch. Be sure to configure Firefox to allow loading scripts from a "file://" link.
If you want to compile it yourself, change the dependencies of the thespis crates to git links pointing to the dev branches.
Should this still be open? I'd love to help work on this; it seems like an interesting issue.
This reminded me that I forgot to update the post above. tracing_prism now does support JSON. I'm pretty happy with it personally. I looked into a more streamlined workflow than loading the log in the page with the browse button, but I felt that in the end it wasn't worth it, so I have put that on hold unless there is demand.
Of course people might prefer other ways of visualizing the logs, so I suppose @hawkw will get back to you explaining why she added the "help wanted" tag...
It would be nice to have support for one of the formats that profiler.firefox supports, like: https://docs.google.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/preview#heading=h.yr4qxyxotyw
Feature Request
Motivation
A decently common feature request that comes up in conversations is for tracing to provide some sort of visualization of spans and events in some sort of GUI. This would help `tracing` users better understand and debug their applications.
Proposal
At a high level, I can think of a few options:
Alternatives
Don't do this.