golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/trace: problems and proposed improvements #33322

Open jcorbin opened 5 years ago

jcorbin commented 5 years ago

Writing this issue out of the GopherCon 2019 contribution workshop, as a sort of umbrella task to encompass various limitations in the current go tool trace, and to speculate about possible solutions.

NOTE: most of these remarks are as of Go 1.11, which was the latest version that I reconciled into an experimental prototype fork. All feedback on what I've missed in the 1.12 or upcoming 1.13 trace tool code is more than welcome. As I write this issue, I'm skimming through the code on master to try to verify my statements, but the last time that I did a truly deep look was 1.11.

Problems

  1. Event Parsing: the internal/trace parser used by the current tool is a rather slow two-phase parser; also, being internal, it cannot be reused by other tools interested in processing trace files
  2. Event Processing: the trace tool currently has two separate state machines for processing events: one for generating catapult-compatible JSON, the other for generating a statistics table
  3. Lack of export: other than looking at the data in the catapult viewer, or scanning the HTML statistics table, there's no easy or good way to export data for analysis in something like the PyData stack (e.g. I've found loading a CSV file into a pandas dataframe to be quite effective for exploration)
  4. Viewer Chunking: finally, and perhaps most superficially, the main reason for splitting traces (chunking them into different time sections), as far as I can tell, is V8's 100MB limit on strings when parsing JSON. This bites largely because the catapult JSON data is typically between 10x and 100x larger than the native binary data.

Proposed Improvements

Parsing (library?)

A golang.org/x/trace library for dealing with trace files would be great, covering things like the parsing limitations above.

Many of these improvements could be made in place inside internal/trace, but are perhaps better explored first / concretely proposed through a standalone personal repository, eventually perhaps for a golang.org/x package.

Event Processing (library?)

However, much of what you need when dealing with trace data is a state machine that tracks the state of various Go runtime objects. There are at least two separate ones within the current trace tool: one for creating catapult JSON, the other for tabulating statistics.

This state machine, just like the events themselves, is highly dependent on the Go runtime, and necessarily needs updates for new versions. I expect there can be a vigorous discussion around which layer is best to export in a library: should it be the parser above (low or high level), or instead something more processed, like spans of goroutine-relevant time...

Data Export Mode (tool change?)

Library or not, making the trace tool able to be used in an offline way to simply transform the binary data into a useful form for consumption (by Unix tools, by Pandas, etc) has proven quite useful in my experimental fork.

One useful pivot on the raw events is to group them into spans of goroutine-relevant time, e.g. (conversationally):

Basically, you can think of goroutines as going through this state machine (a direct analog of the classic Bach '86 Unix process state machine): [image: Goroutine Lifecycle diagram]

A really fruitful dataset, then, is a table of the time spent by each goroutine in each state.

Better Trace Viewing

Now that most JavaScript runtimes support streaming, perhaps we could do better than static 100MB chunks.

As even more of a green-field idea, maybe we could use something like http://gioui.org to build a dedicated trace viewer that wouldn't be held back by the limitations of DOM parsing or rendering.

ianlancetaylor commented 5 years ago

CC @dvyukov @hyangah @pjweinbgo