mozilla / fxa

Monorepo for Mozilla Accounts (formerly Firefox Accounts)
https://mozilla.github.io/ecosystem-platform/
Mozilla Public License 2.0

Include anon_id in FxA server activity metrics #5313

Closed data-sync-user closed 4 years ago

data-sync-user commented 4 years ago

The proposed first deliverable for AET calls for a comparative analysis of Daily Active Users between FxA and Firefox, which means that FxA will need to somehow submit its own activity events into the AET pipeline.

At a high level, I think we want something similar to the once-per-day "fxa_activity - active" event that was synthesized over in https://bugzilla.mozilla.org/show_bug.cgi?id=1632635. For each user that was active according to the FxA service on a given day, there should be one AET ping with that user's ecosystem_anon_id submitted to AET.

I do not know the right way to arrange for that to be submitted. Perhaps we could include the ecosystem_anon_id in the existing metrics events like cert_signed and token_created, and have a daily job that summarizes those events, turns them into AET ping data, and submits them to the pipeline? Naively, such a setup sounds very similar to what Bug 1632635 did for submitting into Amplitude.
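As a rough illustration of the daily summarizing job floated above, the sketch below collapses a day's worth of logged events into at most one ping per ecosystem_anon_id per UTC day. The `ActivityEvent` and `DailyActivityPing` shapes, the field names, and the `summarizeDailyActivity` function are all hypothetical; this is neither the real FxA log format nor the real AET ping schema.

```typescript
// Hypothetical shape of a logged activity event; field names are assumptions.
interface ActivityEvent {
  event: string; // e.g. "cert_signed" or "token_created"
  timestamp: number; // milliseconds since the Unix epoch
  ecosystemAnonId?: string; // absent for accounts without an AET identifier
}

// Hypothetical shape of the once-per-day ping; not the real AET schema.
interface DailyActivityPing {
  ecosystem_anon_id: string;
  day: string; // UTC date, formatted YYYY-MM-DD
}

// Reduce raw events to one ping per (UTC day, ecosystem_anon_id) pair.
function summarizeDailyActivity(events: ActivityEvent[]): DailyActivityPing[] {
  const seen = new Set<string>();
  const pings: DailyActivityPing[] = [];
  for (const e of events) {
    // Events without an identifier cannot be attributed, so skip them.
    if (!e.ecosystemAnonId) continue;
    const day = new Date(e.timestamp).toISOString().slice(0, 10);
    const key = `${day}|${e.ecosystemAnonId}`;
    if (seen.has(key)) continue;
    seen.add(key);
    pings.push({ ecosystem_anon_id: e.ecosystemAnonId, day });
  }
  return pings;
}
```

Submitting the resulting pings to the pipeline is deliberately left out, since the thread does not settle where that step would live.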

┆Issue is synchronized with this Jira Task ┆Issue Number: FXA-1885

data-sync-user commented 4 years ago

➤ Chelsea Lewis commented:

[~jhirsch@mozilla.com] and [~wclouser@mozilla.com] - can we get an estimate added here as well?

data-sync-user commented 4 years ago

➤ Dave Justice commented:

Some notes from a slack conversation with [~jhirsch@mozilla.com]

bq. Hey! Here’s some background on how the daily server events work right now: https://jira.mozilla.com/browse/FXA-1780?focusedCommentId=87627&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-87627

bq. I think we can either add the AET identifier (ecosystem_anon_id) to one of the existing events, or create a new event and fire it from the same spot, but pointed at a different data pipeline endpoint specifically for AET.

bq. I think all we need to do for this issue is log the event with the ecosystem_client_id and ecosystem_anon_id. The routing from server logs to AET ingestion endpoint can be handled by data eng / ops on the backend

bq. Here is the existing spot where we record metrics when an oauth token is issued. I don’t know exactly how we’re going to get the ecosystem_client_id at this point in auth server, but the ecosystem_anon_id will be associated with the account. I think we’ll be able to get it from the auth DB by using the session token to fetch the account.

bq. https://github.com/mozilla/fxa/blob/main/packages/fxa-auth-server/lib/routes/oauth/index.js#L294-L296

bq. I think the ecosystem client ID can be appended outside the auth server, by the pipeline code. Don't worry about logging that for your issue.
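To make the notes above concrete, here is a minimal sketch of attaching the ecosystem_anon_id to the event recorded when an OAuth token is issued. It assumes the account has already been fetched (for example via the session token, as suggested above). The `AccountRecord` and `TokenCreatedEvent` types and the `buildTokenCreatedEvent` helper are hypothetical and do not mirror the actual auth server code. In line with the last note, ecosystem_client_id is left out for the pipeline to append.

```typescript
// Hypothetical account record; the real auth DB shape may differ.
interface AccountRecord {
  uid: string;
  ecosystemAnonId?: string | null;
}

// Hypothetical event payload written to the server log.
interface TokenCreatedEvent {
  event: "token_created";
  uid: string;
  service: string;
  ecosystem_anon_id?: string;
}

// Build the log payload, adding the AET identifier only when the account has one.
function buildTokenCreatedEvent(
  account: AccountRecord,
  clientId: string
): TokenCreatedEvent {
  const payload: TokenCreatedEvent = {
    event: "token_created",
    uid: account.uid,
    service: clientId,
  };
  if (account.ecosystemAnonId) {
    payload.ecosystem_anon_id = account.ecosystemAnonId;
  }
  return payload;
}
```

Keeping the payload construction in a pure function means it can be tested without a database or a logger; routing the logged event to the AET ingestion endpoint stays on the data eng / ops side.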

data-sync-user commented 4 years ago

➤ Dave Justice commented:

Opened up a draft PR at https://github.com/mozilla/fxa/pull/5929. I'm waiting for https://github.com/mozilla/fxa/pull/5911 to be merged so I can use a method in my test. I'm filling out the data review request form today and will post an update once these two tasks are complete.

data-sync-user commented 4 years ago

➤ Jared Hirsch commented:

Per a discussion in the data stewards meeting earlier this week, storing any new persistent user identifiers (such as the AET identifier) requires review from the Trust folks.

Alicia took a look at the draft data review and related AET docs, and is provisionally OK with the data collection that we've proposed, but since Marshall has more context, we'll need to wait to land this until he has a chance to review.

I expect this will slip to end of day Thursday at the earliest, possibly a day or two later.

data-sync-user commented 4 years ago

➤ Dave Justice commented:

Data review request: https://github.com/mozilla/fxa/pull/5929#issuecomment-659086657