Closed: squarewave closed this 6 years ago
As there is no priority assigned, is this scheduled to be in a ready or usable state by Austin? DevTools is using BHR data to tackle top hangs, and seeing trends would allow us to confirm that the work is having an effect.
I'll make sure this is in a usable state by Austin, yes.
The data is currently at https://analysis-output.telemetry.mozilla.org/bhr/data/hang_aggregates/historical_data.json
What is the structure of this data?
I see json[1] being the data about "DevTools Hangs".
Then, in this object there is another one indexed like this:
json[1][1] = {
Gecko: {}
Gecko_Child: {}
Gecko_Child_ForcePaint: {}
}
I easily follow Gecko versus Gecko_Child (I imagine it is parent process versus child process?)
But then what is ForcePaint?
Then, there is a big dictionary indexed by date, with arrays of 8 entries. This is the part I understand least. What are these arrays? How should I interpret them?
@ochameau, sorry, it's a fairly ad-hoc format that's not arranged in a very self-documenting way. Those arrays form a histogram beginning with the count of 128-256 ms hangs and going up by powers of two. So, [<128-256ms hangs>, <256-512ms hangs>, <512-1024ms hangs>, ...]. Gecko, Gecko_Child, etc. are labels for threads that are tracked by BHR. Gecko is the main thread of the chrome process, Gecko_Child is the main thread of the content process, and Gecko_Child_ForcePaint is, I believe, related to Bug 1279086.
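For anyone reading the raw JSON, here is a minimal sketch of how one of those arrays could be labeled, assuming only what is described above: the first entry counts 128-256 ms hangs and each later bucket doubles. The function names are made up for illustration, and whether the last bucket is open-ended is not stated in this thread, so it is labeled as an ordinary range here.

```python
def label_hang_histogram(counts, first_lower_ms=128):
    """Pair each count with its power-of-two bucket range in milliseconds.

    Assumption: buckets start at 128-256 ms and double each step,
    as described in the comment above. Names here are hypothetical.
    """
    labeled = []
    lower = first_lower_ms
    for count in counts:
        upper = lower * 2
        labeled.append((f"{lower}-{upper}ms", count))
        lower = upper
    return labeled


def hangs_at_or_above(counts, threshold_ms, first_lower_ms=128):
    """Sum the counts of every bucket whose lower bound is >= threshold_ms."""
    total = 0
    lower = first_lower_ms
    for count in counts:
        if lower >= threshold_ms:
            total += count
        lower *= 2
    return total


if __name__ == "__main__":
    # Made-up sample array with 8 entries, not real BHR data.
    sample = [40, 20, 10, 5, 3, 2, 1, 0]
    for label, count in label_hang_histogram(sample):
        print(label, count)
    print("hangs >= 2048ms:", hangs_at_or_above(sample, 2048))
```

With 8 entries, the buckets under this assumption run from 128-256ms up to 16384-32768ms.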
@digitarald, does this dashboard suit your needs?
The dashboard looks really good to me! The seven day average looks very handy.
Is this data only against the Nightly population? It looks like we got a growth in hangs starting from Oct 12th, reaching a maximum around Nov 1st and then going down. But I imagine it may also relate to overall usage of Firefox/DevTools?
It is only against the Nightly population, yes. Regarding it relating to overall usage of Firefox, since it's normalized by usage hours, it shouldn't be directly related to that. However, if the Nightly population changes significantly then that can certainly affect things, and greater usage of devtools relative to total Firefox usage would also cause these numbers to go up. Comparing to all hangs, I don't see the same bump, so either the population changed in some way (or the population's behaviors, same thing), or the bump was caused by the code changing, or it was a fluke.
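To make the normalization concrete, here is a rough sketch of dividing hang counts by usage hours and then smoothing with a trailing seven-entry average. This is not the dashboard's actual code; the function names and the choice of a trailing window are assumptions for illustration only.

```python
def hangs_per_usage_hour(hang_count, usage_hours):
    """Normalize a raw hang count by total usage hours (hypothetical helper)."""
    if usage_hours <= 0:
        raise ValueError("usage_hours must be positive")
    return hang_count / usage_hours


def trailing_average(values, window=7):
    """Average each value with up to window - 1 values before it.

    Assumption: a trailing window; the dashboard may compute its
    seven day average differently.
    """
    averages = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


if __name__ == "__main__":
    # Made-up (hang_count, usage_hours) pairs, one per day.
    days = [(10, 4.0), (12, 4.0), (30, 10.0)]
    rates = [hangs_per_usage_hour(h, u) for h, u in days]
    print(rates)
    print(trailing_average(rates))
```

Because the rate is per usage hour, doubling both the hang count and the usage hours leaves it unchanged, which is why a plain increase in Firefox usage should not move the graph by itself.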
@squarewave amazing dashboard!
~Being able to disable the lower thresholds to get a higher resolution on the higher bounds would be great, as we care much more about > 2048ms than 128ms.~
This works 🎉 !
@digitarald you can click the lower thresholds in the legend to stop showing them. This will also adjust the graph so you can see more resolution on the others.
I think you can close this ticket, the dashboard is really helpful for us.
I only have two comments:
Are you confident that the data is really indexed by build date?
I can't see any way of it being indexed by submission date or anything else. There's a pretty clear path from grabbing 'application/buildId' and grouping by that, and submission date isn't even referenced anywhere in the job.
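As a rough illustration of that grouping (not the job's actual code), the sketch below counts pings per build date by taking the first eight characters of 'application/buildId'. It assumes build IDs are timestamp strings of the form YYYYMMDDHHMMSS and that a ping is a nested dict; the function names and sample pings are hypothetical.

```python
from collections import defaultdict


def build_date(build_id):
    """Return the YYYYMMDD prefix of a build ID.

    Assumption: the build ID is a timestamp string like '20171101220120'.
    """
    return build_id[:8]


def count_by_build_date(pings):
    """Count pings per build date, keyed on application/buildId only.

    Submission date is never read, mirroring the description above.
    """
    counts = defaultdict(int)
    for ping in pings:
        counts[build_date(ping["application"]["buildId"])] += 1
    return dict(counts)


if __name__ == "__main__":
    # Made-up pings for illustration.
    pings = [
        {"application": {"buildId": "20171101220120"}},
        {"application": {"buildId": "20171101100419"}},
        {"application": {"buildId": "20171102100041"}},
    ]
    print(count_by_build_date(pings))
```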
This will allow users to see historical trends for a particular criterion without having to wait for a long time for the normal, fully explorable historical data (which also might crash the content process.) The data is currently at https://analysis-output.telemetry.mozilla.org/bhr/data/hang_aggregates/historical_data.json, so we just need to build a viewer for it.