Closed data-sync-user closed 3 years ago
@6a68 I just assigned this to myself but after rereading the ask a couple times, I want to make sure I've got this right - we're clearing... the anon_id
? We'll hit generateEcosystemAnonId
from a password reset here which updates the value here so this feels mostly done already. Is this issue essentially to double check the old one is donezo?
Edit: Hmmm, rereading the title, I realize it says where it isn't updated, so maybe this an edgecase of #6017. Makes more sense, we want to make sure the old value remains the same if there's an error in anon_id creation/update - let me know if that's incorrect.
Re: comment above - the description has been updated per Slack conversations. cc @6a68 to check my accuracy of summarizing ;)
I’m not sure how to generate a placeholder or how to know if this placeholder is not a real anon_id for checking if we need to refetch on profile load, but it might be more obvious once I dig in.
maybe retry with exponential backoff
To the last point, if we’re doing a check on profile reload each time to see if it needs to be regenerated, maybe we don’t even need an exponential backoff but just let it try again on next profile reload. I’d imagine this would be a pretty rare occurrence.
Your summary in the description looks accurate to me :+1: but this is one of those things we didn't fully work out ahead of time, and I think we missed something.
I’m not sure how to generate a placeholder
This part would be easy, just use crypto.getRandomValues
. There might even be a nice wrapper in the content server's app/scripts/lib/crypto
or in the new fxa-auth-client (formerly fxa-js-client, just landed).
how to know if this placeholder is not a real anon_id
Yup, this is the missing piece.
By the magic of crypto, the random placeholder and anon_id are indistinguishable. The only way to know a placeholder value is a placeholder is either to generate the anon_id and compare the two values, or to mark the account as having a stale/placeholder anon_id. (Or some third approach I haven't thought of.)
I think generating the anon_id and checking against the stored value is a nicer approach, because it doesn't introduce new bits of state, and because it reduces the window of placeholder usage, whether that is the AET bootstrapping placeholder uploaded by Sync, or the password reset placeholder uploaded by FxA. But it entails checking the anon_id against the stored value each time the user enters their password, something we just decided not to do.
On the other hand, if we want to instead put some state on the account to mark an anon_id as stale/placeholder, we'd have to either add a field to the account (mysql changes), or make a new list in redis of affected users or anon_ids, and then (handwaving here) add some API to allow the front-end to ping the server to ask if the user has a 'real' anon_id. We would also never replace the Sync-generated placeholder until a password reset, which doesn't seem ideal. This 'storing extra state' approach seems like more work and more complexity, but I could be missing something; happy to discuss further.
Now, it's possible there is some still-missing edge case here, because this is an incomplete part of the design: password resets aren't yet documented in the main technical doc for AET, and the last time @rfk and I talked about this, we talked about atomically clearing the anon_id on password reset, but not about maintaining data continuity across that flow.
I'm not sure just where in the account model we'll want to do this more eager anon_id generation and checking, but I bet @vbudhram can help.
Ah, another point I just thought of: we want to slowly enroll users in AET, or at least be able to control the rate at which the whole set of FxA users is enrolled in AET and pings start flowing in. So, when we do the anon_id comparison, I think we should just bail early if the current value is falsy.
This seems like a decision we'd want to take to the larger AET group to make sure it sounds good to everyone, and it might be something we revisit later.
I am far from certain, but it looks like this promise chain might be the right spot to add the anon_id check-and-regenerate call: https://github.com/mozilla/fxa/blob/main/packages/fxa-content-server/app/scripts/views/complete_reset_password.js#L181
Ah, nope. Looks like account model has both the resetPassword
(start of reset) and completePasswordReset
(end of reset) methods, that's probably the right level for anon_id twiddling (the views don't need to know about the anon_id at all)
we want to slowly enroll users in AET
@6a68 In train 181, the AET values will get set on password change and reset. Do we want to disable this until the sync placeholder parts in ready?
Yea that looks right too me 👍🏽
we want to slowly enroll users in AET
@6a68 In train 181, the AET values will get set on password change and reset. Do we want to disable this until the sync placeholder parts in ready?
Good question, I've been thinking about this a bit. If the existing anon_id is falsy, we can just do nothing. I think that'll solve the gradual onboarding problem.
Yea that looks right too me 👍🏽
Great! Thanks :beers:
@6a68 In train 181, the AET values will get set on password change and reset.
This should only be in staging I believe. The data pipeline public keys are set to an empty array in production right now.
public keys are set to an empty array in production right now.
Ah, yup then we should be good
@dannycoates suggested we skip the placeholder and just update anon_id when we set the new password. Seems like an improvement to me 👍
just update anon_id when we set the new password
@6a68 I'm not sure I understand what this means or Danny's comment on "It’s ok to use anon1 while the reset is in progress" - aren't we setting the password at the end of a password reset? Do we mean with PW reset we'll send up the real, new anon_id instead of a placeholder (in which case kB
will change after PW reset so ? 🤔), or that we aren't going to update the anon_id during a password reset but only on regular password change when kB
doesn't change so we can generate and send up in the same request?
I think the idea is, rather than:
Danny's suggestion is to just
A little more explicitly, I think this means the only time we'd generate a new anon_id would be inside the reset completion step, somewhere in this promise chain https://github.com/mozilla/fxa/blob/main/packages/fxa-content-server/app/scripts/views/complete_reset_password.js#L181
Forgive me for the drive-by engineering, I'm working on Send at the moment and not keeping up with the convo, but yeah, Jared's got the gist. Ideally, (not necessarily now) I think it'd be best to push the eco(Anon)Id handling all the way down into fxa-auth-client. With resetPasswordWithRecoveryKey
and passwordChange
being the main functions where we'd trigger calculating new ones.
Forgive me for the drive-by engineering
Nothing to forgive, that's one reason the team swarming approach can be great. This is a much simpler approach so I appreciate and thank you for seeing this and dropping in before I'd started. 😁
➤ Jared Hirsch commented:
After team discussion, this doesn't seem essential for the scoped keys epic. Moving out to a future placeholder epic.
In a discussion earlier today, we determined this issue should be punted out to another epic where we tackle frequent AET identifier rotations.
I started moving the call to update the anon_id into fxa-auth-client
where we send the request for password reset/change, but this has a couple of issues:
It may make sense later to move all the handling into fxa-auth-client
, but we also like the separation keeping it in the account model gives us and that the account model is already setup to work with localstorage. This is more of an issue with password reset than it is with changePassword
due to a new kB
, but at this point in time, the crash window appears to be a non-issue because of how infrequently we are rotating and how small the window is.
The longer term solution for this when we rotate keys opportunistically when users enter their passwords could be to add the anon_id to at least the the PW reset request so that there is only one network request instead of two.
It was also discussed to make sure we send the correct headers when we send the update from the client so that when we do update we can ensure there’s not two clients that disagree on the value, but this doesn’t matter right now because the clients can keep using the anon_id and AET will still unwrap them to the same eco ID.
We discussed adding additional logging but decided it was unnecessary since we log successful password changes/resets and when the anon_id is updated.
➤ Jody Heavener commented:
AET is no longer on our roadmap.
There is a small window between a successful password reset and a successful reset of the anon_id. If something were to crash or fail during this time, the user’s password would be reset but their anon_id would not be rotated properly.
This could be considered a partial spin-off of #6017 but the scope may extend into other areas.
Originally it was suggested that we could clear the old anon_id value because at the next sign-in (or fetch of the user profile), we could then generate and set a “real” value. However, we want to maintain anon_id continuity across password resets. One solution could be to send a placeholder anon_id into the db at password reset, marked as “dirty” or to be replaced by the client as soon as possible.
The solution could look like this:
┆Issue is synchronized with this Jira Task ┆Issue Number: FXA-1236