taskcluster / taskgraph

Generates task dependency graphs for Taskcluster CI
Mozilla Public License 2.0

Add a tool for tracing transforms #6

Open ahal opened 2 years ago

ahal commented 2 years ago

Imported from: https://bugzilla.mozilla.org/show_bug.cgi?id=1676972

One of the biggest challenges of debugging the taskgraph is tracing what happens in the transforms. Let's say there's an oddity in test-windows7-32/opt-mochitest-browser-chrome-e10s-1 and you want to debug what's going on. The experience is miserable because:

  1. There are thousands of tasks, so whether using logging or a debugger, you'll have to sift through or filter out all of the irrelevant tasks.
  2. The labels are not finalized until the end, so the task isn't called test-windows7-32/opt-mochitest-browser-chrome-e10s-1 at the beginning, which makes it hard to narrow in on a single task.
  3. Tasks can split into multiple tasks, which complicates matters further.

I'd love to have a --trace-transforms flag that, when specified, dumps only the transform logging that applies to a given final task. I envision this being used in conjunction with --tasks-regex so we can limit which tasks' logs are dumped.

The implementation here is going to be difficult, but I think it should be possible. Roughly my idea is:

Have a special TransformLogger that buffers logs in a tree-like data structure. Each node in the tree contains the output from a single yield of a single transform. Each node (except the root) has a single parent, which represents the output from the previous transform. Each child node (there can be several if a transform splits tasks) contains the output from the next transform. Each leaf node contains a finished task label.
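To make the tree idea concrete, here is a minimal sketch of what one node might look like. Everything in it is an assumption for illustration: the `TraceNode` class, its fields, and the transform names are hypothetical and not part of taskgraph.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity-based comparison, which avoids recursing through
# the parent/children cycle when nodes are compared.
@dataclass(eq=False)
class TraceNode:
    """Output buffered from a single yield of a single transform (hypothetical)."""

    transform: str  # name of the transform that produced this output
    parent: Optional["TraceNode"] = field(default=None, repr=False)
    messages: List[str] = field(default_factory=list)
    children: List["TraceNode"] = field(default_factory=list)
    label: Optional[str] = None  # only set on leaves, once the label is final

    def child(self, transform: str) -> "TraceNode":
        """Create the node for the next transform's output and attach it."""
        node = TraceNode(transform=transform, parent=self)
        self.children.append(node)
        return node

    def log(self, message: str) -> None:
        """Buffer a message instead of emitting it immediately."""
        self.messages.append(message)


# A transform that splits one task into two chunks yields two children.
root = TraceNode("root")
split = root.child("split_chunks")  # made-up transform name
split.log("splitting into 2 chunks")

leaf1 = split.child("make_label")  # made-up transform name
leaf1.label = "test-windows7-32/opt-mochitest-browser-chrome-e10s-1"
leaf2 = split.child("make_label")
leaf2.label = "test-windows7-32/opt-mochitest-browser-chrome-e10s-2"
```

Each leaf keeps a reference to its parent, so the chain of transforms that produced a given label can be recovered later without knowing the label up front.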

Then, when --trace-transforms is passed in:

For each leaf node matching a task specified by --tasks-regex (which might be all of them if the flag is not specified), we can walk back up the output tree and stitch together only the output from nodes along that path. That should give us the full output for that specific task alone. We can add some extra logs to delineate where each transform started and finished. We can also log transform durations.
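The leaf-to-root walk could look roughly like the sketch below. It is only an illustration under assumptions: `TraceNode`, `add_child`, `leaves`, `trace_for` and `dump_traces` are hypothetical names, the durations are made-up numbers, and the regex is matched with `re.search` against the final label.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(eq=False)
class TraceNode:
    """Hypothetical node: output from one yield of one transform."""

    transform: str
    parent: Optional["TraceNode"] = field(default=None, repr=False)
    messages: List[str] = field(default_factory=list)
    children: List["TraceNode"] = field(default_factory=list)
    label: Optional[str] = None  # final task label, leaves only
    duration: float = 0.0  # seconds spent in the transform


def add_child(parent, transform, messages, label=None, duration=0.0):
    """Helper for building the example tree."""
    node = TraceNode(transform, parent, list(messages), label=label, duration=duration)
    parent.children.append(node)
    return node


def leaves(node: TraceNode) -> Iterator[TraceNode]:
    """Yield every leaf below (or equal to) node, depth first."""
    if not node.children:
        yield node
    for child in node.children:
        yield from leaves(child)


def trace_for(leaf: TraceNode) -> List[str]:
    """Walk from the leaf up to the root, then emit the output root-first."""
    path = []
    node: Optional[TraceNode] = leaf
    while node is not None:
        path.append(node)
        node = node.parent
    lines = []
    for node in reversed(path):
        lines.append(f"--- begin {node.transform} ({node.duration:.3f}s)")
        lines.extend(node.messages)
        lines.append(f"--- end {node.transform}")
    return lines


def dump_traces(root: TraceNode, tasks_regex: Optional[str] = None) -> Dict[str, List[str]]:
    """Map each matching final label to the stitched output along its path."""
    pattern = re.compile(tasks_regex) if tasks_regex else None
    return {
        leaf.label: trace_for(leaf)
        for leaf in leaves(root)
        if leaf.label and (pattern is None or pattern.search(leaf.label))
    }


root = TraceNode("root")
split = add_child(root, "split_chunks", ["splitting into 2 chunks"], duration=0.002)
add_child(split, "make_label", ["building chunk 1"], label="mochitest-1", duration=0.001)
add_child(split, "make_label", ["building chunk 2"], label="mochitest-2", duration=0.001)

for label, lines in dump_traces(root, r"-1$").items():
    print(label)
    print("\n".join(lines))
```

Output from the shared `split_chunks` node appears in the trace of both chunks, while each chunk's own messages stay out of its sibling's trace.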

If --trace-transforms is not passed in, we can make the logging calls a no-op so there is no performance penalty.
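One way the no-op could work is to pick the logging callable once, up front, based on the flag. This is a sketch under assumptions: `make_tracer` and the `sink` list are hypothetical, and the `%`-style lazy formatting simply mirrors the convention of the standard `logging` module so that message strings are not built when tracing is off.

```python
from typing import Any, Callable, List


def make_tracer(enabled: bool, sink: List[str]) -> Callable[..., None]:
    """Return a trace function; a do-nothing one when tracing is disabled."""
    if not enabled:

        def trace(message: str, *args: Any) -> None:
            # Tracing is off: no formatting and no buffering happen here.
            return None

        return trace

    def trace(message: str, *args: Any) -> None:
        # Format lazily so callers only pay for it when tracing is on.
        sink.append(message % args if args else message)

    return trace


buffered: List[str] = []

trace = make_tracer(enabled=False, sink=buffered)
trace("chunk %d of %d", 1, 2)  # discarded

trace = make_tracer(enabled=True, sink=buffered)
trace("chunk %d of %d", 1, 2)  # buffered as a formatted string
```

A disabled tracer still costs one function call per log site, so "no penalty" here means no formatting or buffering work rather than literally zero overhead.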

ahal commented 2 years ago

Aki writes: Sounds like https://trello.com/c/HDlSBwXG/17-allow-for-tracing-transforms ? :) I think it would be a great tool to have. I'm not sure how high we can prioritize it.