mozilla / glam

Mozilla's primary interactive dashboard for examining the distribution of telemetry values.
https://glam.telemetry.mozilla.org
Mozilla Public License 2.0

Assess challenges re: implementing iOS ETL #1105

Closed rafrombrc closed 2 years ago

rafrombrc commented 3 years ago

We need to dig into the iOS versioning, build ids, etc. to see if there will be any unexpected pain in getting the iOS data into GLAM.

acmiyaguchi commented 3 years ago

There are a few things that will need to be implemented on the frontend in order to support iOS data. The following apps are currently sending iOS data:

Some examples of build_id and version values are as follows:

build_id  version
3279      30.0
3120      29.2
18809     28.0

GLAM currently relies on the fact that the build id has (so far) encoded some sort of date that we can use to build plots over time. We do not have a source of truth for iOS builds like buildhub or geckoview version, so it will be tricky to convert the build id into a suitable format for GLAM. The versions follow semantic versioning, so they should be okay for what we're currently doing (using only the major version for aggregates).
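To make the two points above concrete, here is a minimal sketch using only the sample values listed in this comment. The helper name `major_version` is hypothetical and is not taken from the GLAM ETL; it simply assumes versions look like `30.0`. The sketch also shows why the build id cannot be used to order releases: sorting by build id puts 28.0 after 30.0.

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a dotted version string.

    Hypothetical helper for illustration; assumes versions look like "30.0".
    """
    head = version.strip().split(".", 1)[0]
    if not head.isdigit():
        raise ValueError(f"unexpected version format: {version!r}")
    return int(head)


# (build_id, version) pairs copied from the examples in this comment.
samples = [("3279", "30.0"), ("3120", "29.2"), ("18809", "28.0")]

# Major version per build id, which is what the aggregates would key on.
majors = {build_id: major_version(version) for build_id, version in samples}

# Ordering by build id does not match ordering by version:
# the largest build id (18809) belongs to the oldest version (28.0).
by_build_id = [v for _, v in sorted(samples, key=lambda s: int(s[0]))]
by_version = [v for _, v in sorted(samples, key=lambda s: major_version(s[1]))]
```

Here `by_build_id` and `by_version` come out in different orders, which is the mismatch that makes a date-like field necessary for plots over time.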

Here is the notebook with a few small queries to figure out the shape of the two fields. https://colab.research.google.com/drive/1bVkze2EkTwr8fTB5SQQRXpkkdm3yrQu9?usp=sharing

rafrombrc commented 3 years ago

We know that Fenix encodes dates into opaque build ids, and we have a UDF to decode them. Hopefully iOS is doing something similar; we should check whether these build ids encode any info.

st3fan commented 2 years ago

On iOS the build numbers are simply values that increment every time our CI environment does a build. They have no meaning. This build increment also happens for pull request builds or test builds, so the numbers can jump up significantly between official releases.

Note that for beta (TestFlight) releases we can have many builds in use, all with the same version (39.0) but different build ids (8767, 8797, 8798, ...).

If GLAM needs a date stamp that is unique for each build then we can easily add this.

st3fan commented 2 years ago

I noticed that two applications are missing in the list in the issue story:

st3fan commented 2 years ago

If GLAM needs a date stamp that is unique for each build then we can easily add this.

To clarify: on iOS we would prefer to stick with the current build number scheme, but we can easily add a Glean value that contains a build datetime.

alekhyamoz commented 2 years ago

Thank you, Stefan, for all the information. We wanted to check whether there is a way to pull the datetime associated with a build via an API (related to CI). If not, adding a build datetime would be fitting.

If we decide to add the datetime, my understanding is that it would only apply going forward and cannot be backfilled. Is that correct?

st3fan commented 2 years ago

@travis79 From a CI/Build perspective it is really simple to inject a datetime into the app. But I don't know how to transfer that to Glean. Should we just add a metric for that? Or is there a Glean API for this?

travis79 commented 2 years ago

There currently isn't a way to send a user-defined metric in all Glean pings, so we would only see this in the metrics ping and in any custom pings it was added to. If that is sufficient, then I'm fine with it. If you really want this in every ping, then it might take a change to Glean in order to accommodate. cc @badboy for visibility into this.

rafrombrc commented 2 years ago

@travis79 @st3fan As @alekhyamoz says, if there's an API somewhere from which we could pass in a build id and get back a build date, that would be better for us than having the build date added to every Glean ping. Or we could even just create a BigQuery table that stores these mappings and then look them up when we need them. Does Glean have a mechanism for supporting the idea of an "out of band" ping that contains app metadata or something? Literally one single ping per build with that info would be enough.
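A rough sketch of the mapping-table idea, written against an in-memory dict rather than BigQuery so it stays self-contained. Everything here is an assumption for illustration: the names `BUILD_DATES`, `lookup_build_date` and `attach_build_dates` are hypothetical, the key shape (version, build id) is a guess, and the dates are placeholders, not real build dates. Only the version and build ids come from this thread.

```python
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple

BuildKey = Tuple[str, str]  # (version, build_id); assumed key shape

# Placeholder rows: the version and build ids appear in this thread,
# but the dates are invented purely for illustration.
BUILD_DATES: Dict[BuildKey, date] = {
    ("39.0", "8767"): date(2021, 1, 1),
    ("39.0", "8797"): date(2021, 1, 2),
}


def lookup_build_date(
    version: str, build_id: str, table: Dict[BuildKey, date] = BUILD_DATES
) -> Optional[date]:
    """Return the recorded build date, or None if the build is unknown."""
    return table.get((version, build_id))


def attach_build_dates(
    rows: Iterable[dict], table: Dict[BuildKey, date] = BUILD_DATES
) -> List[dict]:
    """Left-join style enrichment: keep every row, add build_date when known."""
    enriched = []
    for row in rows:
        build_date = lookup_build_date(row["version"], row["build_id"], table)
        enriched.append({**row, "build_date": build_date})
    return enriched


rows = [
    {"version": "39.0", "build_id": "8767"},
    {"version": "39.0", "build_id": "8798"},  # no mapping recorded yet
]
result = attach_build_dates(rows)
```

Builds without a mapping keep a `build_date` of `None` rather than being dropped, so missing entries stay visible and can be filled in later.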

st3fan commented 2 years ago

if there's an API somewhere from which we could pass in a build id and get back a build date, that would be better for us

Unfortunately there is not. But we could do the reverse: our CI system could ping another system to announce a new build. Would that work?

(I'm not in favor of this kind of inter-infrastructure dependency, btw. Moving things into Glean would be my preference.)

travis79 commented 2 years ago

The easiest path would probably be to add this as an optional measurement to the Glean "client_info" that is sent in all pings. The discussion around this has picked up again, so hopefully we can arrive at a solution that doesn't rely on CI systems connecting the dots for us.

rafrombrc commented 2 years ago

Okay, is there an issue tracking this work that we can follow, @travis79?

travis79 commented 2 years ago

I don't see a bug in Glean for this yet. If you wanted to file one, here is the right component

alekhyamoz commented 2 years ago

@travis79 Filed a bug to track the issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1742448 Please let me know if any further information is required.

alekhyamoz commented 2 years ago

Jan-Erik and I came up with a proposal document: https://docs.google.com/document/d/1_7kTePQHHRhsAqOYPiw8ptoN9ytRnsWMcN-tddnV0Cg/edit#heading=h.bpxdzoa6nutw Looking forward to implementing this in H1 2022.

alekhyamoz commented 2 years ago

Assessed!