mozilla / glam

Mozilla's primary interactive dashboard for examining the distribution of telemetry values.
https://glam.telemetry.mozilla.org
Mozilla Public License 2.0

Inconsistencies on Glean graphs make analysis hard, especially when compared to Legacy #2768

Open edugfilho opened 3 months ago

edugfilho commented 3 months ago

Reported on Slack by @Standard8

Although the numbers are about the same, the Glean graph shows a lot more variance. In part, this seems to be due to variance in the number of reporting clients. The legacy case:

Screenshot 2024-03-21 at 09 26 10

and the Glean case:

Screenshot 2024-03-21 at 09 23 28

Whilst I realise these are different systems and this is nightly, the Glean one seems a lot more inconsistent. For example, there are virtually no client reports for a couple of days around the end of February, and then there is the big curved(?) dip just before 17 March.

From a measurement viewpoint, we get a nice drop in the legacy graph on 14 March. On Glean, there's so much variance that I can't really tell whether it has dropped, or when.

Screenshot 2024-03-21 at 09 32 49

Or whether it dropped and then went back up again.

That's all looking at the nightly channel. On release, Legacy has a reasonably consistent 80M clients, and level graphs.

Screenshot 2024-03-21 at 09 36 12

Glean has at best 20M clients, dropping to as few as ~120k, and as a result the graphs show bigger swings, making it harder to figure out what's happening.

Screenshot 2024-03-21 at 09 36 24
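To put rough numbers on why fewer clients means bigger swings, here is a minimal sketch. It assumes clients report independently and that the plotted statistic behaves like a sample mean, so its standard error scales as 1/sqrt(n). Neither assumption comes from GLAM itself, and the helper name `relative_noise` is hypothetical; only the client counts are taken from the figures quoted above.

```python
import math


def relative_noise(n_clients: int, reference_clients: int) -> float:
    """Return how many times noisier an estimate from n_clients is
    compared with one from reference_clients.

    Assumes independent clients and a mean-like statistic, so the
    standard error is proportional to 1 / sqrt(n).
    """
    return math.sqrt(reference_clients / n_clients)


legacy_clients = 80_000_000   # Legacy release, as quoted above
glean_best = 20_000_000       # Glean release at best
glean_worst = 120_000         # Glean release at worst (~120k)

print(relative_noise(glean_best, legacy_clients))
print(relative_noise(glean_worst, legacy_clients))
```

Under these assumptions the Glean graph would be about 2 times noisier than Legacy at 20M clients and roughly 26 times noisier at ~120k clients, before any ingestion problem is considered.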

Sometime in the second half of last year, I believe there was a known problem with release ingestion for Glean, but it was never announced whether that had been fixed. I was therefore wondering whether it had been affecting nightly as well, given the variance in client counts there.

I don't know whether the root cause here is a Glean issue or a GLAM one, but unfortunately I think the inconsistencies make the Glean graphs less useful.

edugfilho commented 3 months ago

Regarding the first two images and client volume: I manually looked into each build for search_service_init2_ms (Legacy) and search.service.startup_time (Glean), which are the same measurement reported through GIFFT. I can confirm that, except for the drop from Mar 14 to Mar 17 on Glean, all other client volume drops in Glean come from builds that are not present in Legacy.
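A minimal sketch of this kind of build-by-build comparison, assuming the per-build client counts for both probes have been exported into dictionaries keyed by build ID. The helper name, build IDs, and counts below are hypothetical and for illustration only; they are not GLAM data or a GLAM API.

```python
def builds_missing_from_legacy(legacy_counts: dict, glean_counts: dict) -> list:
    """Return the build IDs that appear in Glean but not in Legacy, sorted."""
    return sorted(set(glean_counts) - set(legacy_counts))


# Hypothetical per-build client counts (build ID -> number of clients).
legacy_counts = {"20240312": 5200, "20240313": 5100}
glean_counts = {"20240312": 5000, "20240313": 4900, "20240313b": 40}

for build_id in builds_missing_from_legacy(legacy_counts, glean_counts):
    # Builds listed here have no Legacy counterpart, so a low client
    # count on them shows up as a dip only in the Glean graph.
    print(build_id, glean_counts[build_id])
```

Any build this prints exists only on the Glean side, which is the pattern described above for every drop except the one from Mar 14 to Mar 17.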