catapult-project / catapult

Deprecated Catapult GitHub. Please instead use http://crbug.com "Speed>Benchmarks" component for bugs and https://chromium.googlesource.com/catapult for downloading and editing source code..
https://chromium.googlesource.com/catapult
BSD 3-Clause "New" or "Revised" License
1.92k stars 564 forks source link

Build tiny slices into the slice hierarchy correctly #1131

Open catapult-bot opened 8 years ago

catapult-bot commented 8 years ago

Issue by yuhao Friday Jul 24, 2015 at 21:37 GMT Originally opened as https://github.com/google/trace-viewer/issues/1131


By tiny slices, I mean instant events or X events with duration of 0.

Currently, there are cases where tiny slices are not correctly built into the hierarchy.

As an example, in this trace (https://github.com/natduca/measurmt-traces/blob/master/mobile/googlemaps_ipc_newbinding.html), there is an 'X' event TaskGraphRunner::CollectCompletedTasks (starting at 1,935.145 ms) with duration of 0, and an 'I' event DidUpdateVisibleTiles right (starting at 1,935.175 ms). Both tiny slices are misplaced with respect to the entire slice hierarchy.

catapult-bot commented 8 years ago

Comment by yuhao Friday Jul 24, 2015 at 21:40 GMT


The reason this bug is important to fix is that with the new flow API, we will inevitably create some tiny slices where flows bind. Incorrect placement of tiny slices is important to reason about precise causality.

@natduca @dj2 @vmpstr

catapult-bot commented 8 years ago

Comment by natduca Saturday Jul 25, 2015 at 00:07 GMT


Do we have a reduced test case?

catapult-bot commented 8 years ago

Comment by philippebeaudoin Monday Jul 27, 2015 at 14:51 GMT


Let me try to build one.

catapult-bot commented 8 years ago

Comment by philippebeaudoin Tuesday Jul 28, 2015 at 16:07 GMT


Here's the reduced trace. Looking at the code I find no issue with "TaskGraphRunner::CollectCompletedTasks", both on the original and reduced trace it appears below "ZeroCopyTileTaskWorkerPool::CheckForCompletedTasks" as expected. However, looking at the code I would expect "DidUpdateVisibleTiles" to appear under "TileManager::UpdateVisibleTiles" although at the moment it's under "ZeroCopyTileTaskWorkerPool::CheckForCompletedTasks".

My proposal to solve this problem with be to have events include a "depth" parameter indicating how many slices are open above them in their current thread. When present this parameter would help inform subslice creation to ensure the causality is correctly reflected.

{"traceEvents":[ {"pid":27369,"tid":27369,"ts":326459134461,"ph":"X","cat":"cc","name":"TileManager::UpdateVisibleTiles","args":{},"dur":31,"tdur":30,"tts":1964871}, {"pid":27369,"tid":27369,"ts":326459134461,"ph":"X","cat":"cc","name":"ZeroCopyTileTaskWorkerPool::CheckForCompletedTasks","args":{},"dur":31,"tdur":30,"tts":1964871}, {"pid":27369,"tid":27369,"ts":326459134461,"ph":"X","cat":"cc","name":"TaskGraphRunner::CollectCompletedTasks","args":{},"dur":0,"tdur":0,"tts":1964871}, {"pid":27369,"tid":27369,"ts":326459134492,"ph":"I","cat":"cc","name":"DidUpdateVisibleTiles","args":{"stats":{"canceled_count":0,"completed_count":0}},"tts":1964901,"s":"t"} ]}

catapult-bot commented 8 years ago

Comment by natduca Wednesday Jul 29, 2015 at 07:19 GMT


Can you fully reduce this? shift the ts to zero, simplify the pids, etc, and remove the chrome specific stuff, remove the tts since its not important for the repro, etc. Making it easy to read is good.

We should seek solutions that do not adjust the way tracing c++ side works. For instance, the position in a trace file seems useful.

catapult-bot commented 8 years ago

Comment by yuhao Thursday Jul 30, 2015 at 18:16 GMT


{"traceEvents":[ {"pid":0,"tid":1,"ts":0,"ph":"X","cat":"cc","name":"A","dur":31,"tdur":30}, {"pid":0,"tid":1,"ts":0,"ph":"X","cat":"cc","name":"B","dur":31,"tdur":30}, {"pid":0,"tid":1,"ts":0,"ph":"X","cat":"cc","name":"C","dur":0,"tdur":0}, {"pid":0,"tid":1,"ts":31,"ph":"I","cat":"cc","name":"D","s":"t"} ]}

catapult-bot commented 8 years ago

Comment by yuhao Thursday Jul 30, 2015 at 18:20 GMT


Actually by just looking at the reduced trace, I am not sure how it should look like....

Why should D be under A but not next to A?

catapult-bot commented 8 years ago

Comment by philippebeaudoin Thursday Jul 30, 2015 at 18:39 GMT


This is exactly the problem. If you look at the c++ code you see it's below. But there is no info in the trace to capture that. For example: { TRACE_EVENT0("a", "b"); ... TRACE_EVENT_INSTANT0("a", "c"); }

Yields the exact same trace as: { TRACE_EVENT0("a", "b"); ... } TRACE_EVENT_INSTANT0("a", "c");

The problem is that we do TRACE_EVENT0 with a single "X" event. If it was done with a "B"/"E" pair then event order would allow us to know whether event "c" is inside or outside the brackets. But with a single "X" event there is just no way to know.

I was looking at ways to add as little information as possible to the C++ trace to resolve this ambiguity in a way where we could be backwards compatible. The best idea I've add is a "depth" field. In this case the two pieces of code would yield different traces. The first one: {"pid":0,"tid":1,"ts":0,"ph":"X","cat":"a","name":"b","dur":10,"tdur":10,"depth":0}, {"pid":0,"tid":1,"ts":10,"ph":"I","cat":"a","name":"c","s":"t","depth":1} The second one: {"pid":0,"tid":1,"ts":0,"ph":"X","cat":"a","name":"b","dur":10,"tdur":10,"depth":0}, {"pid":0,"tid":1,"ts":10,"ph":"I","cat":"a","name":"c","s":"t","depth":0}

catapult-bot commented 8 years ago

Comment by philippebeaudoin Thursday Jul 30, 2015 at 18:40 GMT


First sentence should have been: If you look at the C++ you see "DidUpdateVisibleTiles" is called by "TileManager::UpdateVisibleTiles" not "ZeroCopyTileTaskWorkerPool::CheckForCompletedTasks".

catapult-bot commented 8 years ago

Comment by yuhao Friday Jul 31, 2015 at 06:04 GMT


How do we pass the depth info around? Seems like then each API is stateful (depth is the state) whereas in our current design each API is stateless.

catapult-bot commented 8 years ago

Comment by yuhao Friday Jul 31, 2015 at 06:06 GMT


This is very annoying.... If we really think about it, it's because Chrome timer resolution is not high enough. I remember discussing with Vlad about this, and sounds like there is a plan to move to a better timer implementation? Did we ever consider using something like RDTSC, which gives us machine cycle counts? @vmpstr

catapult-bot commented 8 years ago

Comment by natduca Tuesday Aug 04, 2015 at 00:24 GMT


Darn, it looks like we really do have a problem here.

Lets avoid assuming that ts precision can be changed and focus on being resilient.

I think that implies that depth is needed, right? The minimal case that proves this is X, followed by X. We don't know if thats two things that are siblings, or whether the second X is inside.

Is that a correct restating of the problem?

@xianzhu can advise on email if it would affect our performance to track depth at the trace event level, but we should coordinate with @fmeawad in case we change the embedding api to blink and v8 and skia.

Also, how important is this compared to all the other things we have floating around? Is this blocking us getting good related events all the time, or does it only crop up 1% of the time? I don't mind either way: more the thing I'm struggling with is that we have a lot flying around here to get associated events, and it'd be good for us to have a shared sense of which needs to move first.

Sounds like in an ideal world, we would get going on this sooner than later... @vmpstr, maybe you can help here?

philippebeaudoin commented 8 years ago

Nat: Your reduced restating is correct.

This shows we need extra data in the trace, and I think depth is nice and concise so I advocate for it. I believe it can be computed by keeping one counter per thread and accessing it at TraceEvent construction and destruction. I'm not clear on the kind of overhead this entails though. Probably not too bad if we store the counter in a thread local storage (this looks like the perfect use-case for TLS).

Happy to take a deeper look at this when we believe it's a pressing issue.

dj2 commented 8 years ago

Do we need something as complicated as depth, or, in the cases where we have this hierarchy issue, just use B/E instead of X? This is an issue specific with very short nested X events.

philippebeaudoin commented 8 years ago

B/E is an OK workaround. If we go this route we'll need to fix the trace viewer code to use event ordering to determine subslices (it currently doesn't).

One issue with this workaround is that it requires us to be on the lookout for any error and fix them in C++ after we've spotted them in the traces. The traces are potentially broken until the new Chrome version ships. We could have trace viewer spit out a warning whenever we encounter the potentially problematic pattern to help us find them.

Another issue is that if this happens with frequent events or with many different events then we may end up with many begin/end (and significantly larger traces).

Finally, I just don't like the fact that begin/end is not equivalent to TRACE_EVENT. But I'm ready to overlook my philosophical principles in the name of pragmatism. ;)

dj2 commented 8 years ago

The question is, how much complexity is adding depth creating?