Open philipp-sumo opened 7 years ago
@philipp-sumo did you look at the current dashboard that I linked in the post? it basically does what you describe, just not with the exact split you're talking about.
The split that you're talking about would result in 120 different aggregations (4 windows versions * 3 gpu types * 5 locales * 2 accessibility) just for the factors you mentioned. Adding this set of faceted aggregates to the dashboard is not out of the question, but we need to figure out some reasonable way to display all this information so that important/critical information stands out and doesn't get lost in the noise.
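To make that count concrete, here is a minimal sketch that enumerates the facet combinations. Only the number of values per factor (4, 3, 5, 2) comes from the comment above; the value names and the `facet_combinations` helper are placeholders made up for illustration.

```python
from itertools import product

# Hypothetical facet values: only the number of values per factor (4, 3, 5, 2)
# comes from the discussion; the names themselves are placeholders.
FACETS = {
    "windows_version": ["win_a", "win_b", "win_c", "win_d"],
    "gpu_type": ["gpu_a", "gpu_b", "gpu_c"],
    "locale": ["loc_a", "loc_b", "loc_c", "loc_d", "loc_e"],
    "accessibility": [True, False],
}


def facet_combinations(facets):
    """Return every combination of facet values as a list of dicts."""
    names = list(facets)
    return [dict(zip(names, values)) for values in product(*facets.values())]


combos = facet_combinations(FACETS)
print(len(combos))  # 4 * 3 * 5 * 2 = 120
```

Each entry in `combos` would be one aggregate to compute and display per channel, which is why the presentation question matters.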
thank you. the main goal i had with this request was to be able to catch trends that wouldn't be reflected in existing dashboards, as some crash issues might not move the overall rates enough to be detected in a timely fashion or at all. i'd envision one page/dashboard per release channel that contains a number of different charts (the facets mentioned above). in addition, a default view of 7 or 14 days in such graphs may be beneficial so that you have better context when looking at current developments. if such a thing existed it would make monitoring crashing trends a fair bit easier :-)
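a rough illustration of why a spike in one subsection can hide in the overall rate. all numbers below are made up for the example and are not real crash data; the segment names and the helper functions are hypothetical too.

```python
def crash_rate(crashes, usage_khours):
    """Crashes per thousand usage hours."""
    return crashes / usage_khours


def overall_rate(segments):
    """Crash rate over all segments combined."""
    total_crashes = sum(crashes for crashes, _ in segments.values())
    total_usage = sum(usage for _, usage in segments.values())
    return crash_rate(total_crashes, total_usage)


# Made-up numbers: (crashes, usage in thousands of hours) per segment.
# segment_a is a small slice of the user base (5% of usage).
before = {"segment_a": (50, 50.0), "everyone_else": (950, 950.0)}
after = {"segment_a": (100, 50.0), "everyone_else": (950, 950.0)}

print(crash_rate(*before["segment_a"]), crash_rate(*after["segment_a"]))
print(overall_rate(before), overall_rate(after))
```

in this made-up case the crash rate for `segment_a` doubles (1.0 to 2.0) while the overall rate only moves from 1.0 to 1.05, which could easily be lost in normal day-to-day variation.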
A way to show this info would be to show a single graph with a few dropdowns to filter by dimension.
@marco-c we will probably add something like that (either in the detail view or by linking to redash/stmo), but part of the point of this project is to point people in the right direction off-the-bat, instead of forcing them to manually look at the different dimensions of the same aggregate to try and find problems.
Something like https://telemetry.mozilla.org/new-pipeline/evo.html would be nice in my opinion, at least when you have to dig deeper.
hi, i read the recent https://wlach.github.io/blog/2017/10/mission-control/ blog post and thought the approach could be very useful in detecting crash spikes for particular subsections of the user-base.
for example i'd think about a dashboard containing graphs of the current main+content-contentshutdown crash rate (the main metric that we use at other places too) for users on each release channel (esr - release - beta - nightly) split up by factors like: