squarewave / bhr.html

Mozilla Public License 2.0

Add support for more easily accessible historical data #24

Closed squarewave closed 6 years ago

squarewave commented 7 years ago

This will allow users to see historical trends for a particular criterion without having to wait a long time for the normal, fully explorable historical data to load (which also might crash the content process). The data is currently at https://analysis-output.telemetry.mozilla.org/bhr/data/hang_aggregates/historical_data.json, so we just need to build a viewer for it.

digitarald commented 7 years ago

As there is no priority assigned, is this scheduled to be ready, or at least usable, by Austin? DevTools is using BHR data to tackle top hangs, and seeing trends would allow us to confirm that the work is having an effect.

squarewave commented 7 years ago

I'll make sure this is in a usable state by Austin, yes.

ochameau commented 7 years ago

> The data is currently at https://analysis-output.telemetry.mozilla.org/bhr/data/hang_aggregates/historical_data.json

What is the structure of this data?

I see json[1] holds data about "DevTools Hangs". Then, in this object, there is another one indexed like this:

json[1][1] = { Gecko: {}, Gecko_Child: {}, Gecko_Child_ForcePaint: {} }

I easily follow Gecko versus Gecko_Child (I imagine it is parent process versus child process?). But then what is ForcePaint?

Then, there is a big dictionary indexed by date, with arrays of 8 entries. This is the part I understand the least. What are these arrays? How should I interpret them?

squarewave commented 7 years ago

@ochameau, sorry, it's a fairly ad-hoc format that's not arranged in a very self-documenting way. Those arrays form a histogram beginning with the count of 128-256 ms hangs and going up by powers of two. So, [<128-256ms hangs>, <256-512ms hangs>, <512-1024ms hangs>, ...]. Gecko, Gecko_Child, etc. are labels for threads that are tracked by BHR. Gecko is the main thread of the chrome process, Gecko_Child is the main thread of the content process, and Gecko_Child_ForcePaint is, I believe, related to Bug 1279086.
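To make the histogram layout concrete, here is a minimal Python sketch of how one of those 8-entry arrays could be read. It assumes only what is described above: the first slot counts 128-256 ms hangs and each later slot doubles the range. The helper names (`bucket_bounds`, `hangs_at_or_above`) and the sample counts are made up for illustration, and whether the last slot is really capped or is open-ended is not something this sketch can confirm.

```python
from typing import List, Tuple

FIRST_BUCKET_MS = 128  # lower bound of the first slot, per the description above


def bucket_bounds(num_buckets: int) -> List[Tuple[int, int]]:
    """Return (lower_ms, upper_ms) for each histogram slot, doubling each time."""
    return [
        (FIRST_BUCKET_MS * 2 ** i, FIRST_BUCKET_MS * 2 ** (i + 1))
        for i in range(num_buckets)
    ]


def hangs_at_or_above(counts: List[float], threshold_ms: int) -> float:
    """Sum the counts of every slot whose lower bound is >= threshold_ms."""
    bounds = bucket_bounds(len(counts))
    return sum(c for c, (lower, _upper) in zip(counts, bounds) if lower >= threshold_ms)


# Hypothetical counts for one date and one thread; not real BHR data.
example_counts = [40, 20, 10, 5, 3, 2, 1, 0]

labels = [f"{lower}-{upper}ms" for lower, upper in bucket_bounds(len(example_counts))]
long_hangs = hangs_at_or_above(example_counts, 2048)
```

With this reading, slot 0 is the 128-256 ms bucket and slot 4 is the 2048-4096 ms bucket, so summing slots 4 and up gives the hangs of roughly two seconds or longer.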

@digitarald, does this dashboard suit your needs?

ochameau commented 7 years ago

The dashboard looks really good to me! The seven day average looks very handy.

Is this data only for the Nightly population? It looks like hangs started growing on Oct 12th, reached a maximum around Nov 1st, and then went down. But I imagine it may also relate to overall usage of Firefox/DevTools?

squarewave commented 7 years ago

It is only against the Nightly population, yes. Regarding it relating to overall usage of Firefox, since it's normalized by usage hours, it shouldn't be directly related to that. However, if the Nightly population changes significantly then that can certainly affect things, and greater usage of devtools relative to total Firefox usage would also cause these numbers to go up. Comparing to all hangs, I don't see the same bump, so either the population changed in some way (or the population's behaviors, same thing), or the bump was caused by the code changing, or it was a fluke.
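As a rough illustration of why normalizing by usage hours takes overall Firefox usage out of the picture but not devtools usage, here is a small Python sketch. The function name and all of the numbers are hypothetical; the actual job may scale or aggregate differently.

```python
def hangs_per_usage_hour(hang_count: float, usage_hours: float) -> float:
    """Normalize a raw hang count by the usage hours it was collected over."""
    if usage_hours <= 0:
        raise ValueError("usage_hours must be positive")
    return hang_count / usage_hours


# Hypothetical numbers. Doubling overall usage doubles both the raw hang count
# and the usage hours, so the normalized rate stays the same.
baseline = hangs_per_usage_hour(50, 1000)
more_users = hangs_per_usage_hour(100, 2000)

# If the same population spends more of its time in devtools, devtools hangs
# go up while total usage hours do not, so the normalized rate rises.
more_devtools_use = hangs_per_usage_hour(80, 1000)
```

Here `baseline` and `more_users` come out equal, while `more_devtools_use` is higher even though nothing in the code under test changed.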

digitarald commented 7 years ago

@squarewave amazing dashboard!

~being able to disable the lower thresholds to get a higher resolution on the higher bounds would be great, as we care much more about > 2048ms than 128ms.~

This works 🎉 !

squarewave commented 7 years ago

@digitarald you can click the lower thresholds in the legend to stop showing them. This will also adjust the graph so you can see more resolution on the others.

ochameau commented 7 years ago

I think you can close this ticket, the dashboard is really helpful for us.

I only have two comments:

squarewave commented 7 years ago

> are you confident about the data being really indexed by build date?

I can't see any way of it being indexed by submission date or anything else. There's a pretty clear path from grabbing 'application/buildId' and grouping by that, and submission date isn't even referenced anywhere in the job.
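For illustration, the grouping amounts to something like the Python sketch below. This is not the actual job code: the record shape, the `hangs` field, and the helper names are hypothetical, and it assumes the build ID is a timestamp string of the form YYYYMMDDHHMMSS. The `submission_date` field is included in the sample records only to show that it plays no part in the grouping.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, Mapping


def build_date(build_id: str) -> date:
    """Take the date part of a build ID, assumed to look like YYYYMMDDHHMMSS."""
    return datetime.strptime(build_id[:8], "%Y%m%d").date()


def group_by_build_date(pings: Iterable[Mapping]) -> Dict[date, int]:
    """Sum hang counts per build date; submission date is never read."""
    grouped: Dict[date, int] = defaultdict(int)
    for ping in pings:
        grouped[build_date(ping["application/buildId"])] += ping["hangs"]
    return dict(grouped)


# Hypothetical records; not real telemetry.
sample_pings = [
    {"application/buildId": "20171101100226", "submission_date": "2017-11-03", "hangs": 4},
    {"application/buildId": "20171101220120", "submission_date": "2017-11-02", "hangs": 1},
    {"application/buildId": "20171012100228", "submission_date": "2017-11-02", "hangs": 2},
]

by_build_date = group_by_build_date(sample_pings)
```

Two builds from the same day land in the same bucket regardless of when their pings were submitted.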