Closed philbooth closed 3 years ago
@davismtl opened mozilla/fxa-amplitude-send#72, which is related to this (it turns out that the change in #2505 was probably wrong). I opted to close that issue and keep this one because the work will be done in this repo; the import scripts are not involved.
Here's the opening comment from that issue, for posterity:
This might not be a bug but I want to make sure that this is the expected behavior.
We are observing an increase in utm_source = (none) at the expense of utm_source = email, which is declining. We made changes to this because email was over-counting, so that was to be expected. However, I was expecting to see the change for registration parameters but not so much for logins.
This chart shows an increase of login with no utm source: https://analytics.amplitude.com/mozilla-corp/chart/2ngrhz1
Confirmation that it aligns with our train 116 release: https://analytics.amplitude.com/mozilla-corp/chart/2ngrhz1/edit/i6w4h1h
How confident do we feel in the new numbers now? Is this what we were expecting to see change? How would we explain the change in login utm params?
/cc @irrationalagent
I'll look into this more in depth on Monday, but when I first saw this issue my hunch was that all UI-based (menupanel etc.) logins that were previously (erroneously) getting utm_source = email are now getting (none).
I'm basing this on the fact that when testing locally I noticed earlier steps in the login funnels for these entrypoints had (none) for utm_source (until after the email confirmation step, when it was clobbered).
Now that the clobbering is over, (none) may be carrying through to the login complete step.
More generally, it's pretty clear I need to finish my audit of these parameters, so I'll do that ASAP.
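To make the clobbering hunch concrete, here is a minimal sketch of the before/after behaviour. It is not the actual FxA code: `applyLinkParams` and `reportedSource` are hypothetical helpers, and it assumes that a missing utm_source is what shows up as (none) in the charts.

```javascript
// Hypothetical model: params on a clicked link overwrite params already on the flow.
function applyLinkParams(flowParams, linkParams) {
  return { ...flowParams, ...linkParams };
}

// Assumption: a flow with no utm_source is reported as "(none)".
function reportedSource(params) {
  return params.utm_source || '(none)';
}

// A login that starts from a UI entrypoint, with no utm_source set.
const uiFlow = { entrypoint: 'menupanel' };

// Before #2505: the confirmation link carried utm_source=email and clobbered the flow.
const beforeChange = applyLinkParams(uiFlow, { utm_source: 'email', utm_medium: 'email' });

// After #2505: the link no longer sets utm_source, so the missing value carries through.
const afterChange = applyLinkParams(uiFlow, { utm_medium: 'email' });

console.log(reportedSource(beforeChange)); // email
console.log(reportedSource(afterChange)); // (none)
```

If this model is right, the drop in email and the rise in (none) for UI entrypoints should be roughly the same size.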
all UI-based (menupanel etc) logins that were previously (erroneously) getting utm_source = email are now getting (none)
This is what the change was aiming to do, so it's a pretty good hunch in that sense! But it sounds like, from the outcome in Amplitude, it's not what was actually desired?
Taking a look at https://analytics.amplitude.com/mozilla-corp/chart/x14djt1, we do see a drop in utm_source = email that's roughly equivalent to a corresponding increase in utm_source = (none) when entrypoint = preferences.
I think that eventually we do want to set utm_source for UI entrypoints that can lead to registrations, but I'd say that for now the behavior is as expected, and IMO this is not a bug.
While I'm here, a list of things I think we need to clarify before we can settle on what to do more generally with utm_source:
understand the precise difference between utm_source and entrypoint. currently more values are defined for entrypoint, i.e. firstrun and whatsnew are defined for both parameters but UI elements (prefs, menupanel) are only defined for entrypoint.
is utm_source only supposed to concern user acquisition?
I'm a little afraid of the work involved to do this, but if we really wanted utm_source to only represent where a user signed up from, then we would have a pretty clear definition for what the two parameters mean:
utm_source
originating source for a user's registration, ideally only set once and persists for that user forever. strictly a user property.
entrypoint
originating source for all other flows, including logins but also interactions with preferences etc. strictly an event property.
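A rough sketch of what those two definitions would mean in practice, using a hypothetical in-memory model rather than the Amplitude API. The event names and the `recordEvent` helper are made up for illustration.

```javascript
// Hypothetical model: utm_source is a set-once user property,
// entrypoint is a per-event property.
function recordEvent(user, eventType, { utm_source, entrypoint } = {}) {
  // Only the first utm_source seen for a user is kept.
  if (user.properties.utm_source === undefined && utm_source !== undefined) {
    user.properties.utm_source = utm_source;
  }
  // entrypoint is stored on the event itself, so it can differ per flow.
  const event = { type: eventType, entrypoint };
  user.events.push(event);
  return event;
}

const user = { properties: {}, events: [] };

// Registration from firstrun sets the user's utm_source once.
recordEvent(user, 'registration_complete', { utm_source: 'firstrun', entrypoint: 'firstrun' });

// A later login from the menupanel does not change it, even if a link carries utm_source.
recordEvent(user, 'login_complete', { utm_source: 'email', entrypoint: 'menupanel' });

console.log(user.properties.utm_source); // firstrun
console.log(user.events.map((e) => e.entrypoint)); // [ 'firstrun', 'menupanel' ]
```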
@davismtl said today that entrypoint should track where a user enters the flow and utm_source should track where they came from prior to entering. in most other cases, utm_source is set to reference an external webpage that links to your page. in that case entrypoint would naturally be the page they landed on after clicking the link.
I find it a bit difficult to square this logic with our current state of affairs since our sign-in/up forms are embedded directly on the pages that users visit, others are not linking directly to our forms or https://accounts.firefox.com from (for example) other blog posts or moz org pages. thus entrypoint = utm_source in nearly all cases. maybe that's ok, but it seems redundant and confusing.
the only place where the two parameters might not be redundant is UI routes to account settings, for example a user is interacting with the menupanel or synced-tabs (would be utm_source given the logic above) and clicks on the button that takes them to account settings (would be entrypoint from our point of view). however, as noted above, we only track entrypoint for these UI elements, not utm_source, because the parameter entrypoint here is set by the browser to mean "where did the user enter settings from".
I'm just trying to be clear on the nomenclature. If we were to start over, personally I would stick with "entrypoint" only, given the lack of direct traditional links to our signin/up forms. users "enter" the flow at different "points"; they are not (really) driven there by other sources from around the web. put another way, we are not hosting a traditional stand-alone "sign up for firefox accounts here" webpage that gets referral traffic from different webpages (even if it's still really accounts.firefox.com behind the scenes).
But given that so many other teams are embedding our forms on their sites now, and that utm_source is a much more familiar parameter, I'm not sure it would be realistic to just drop that and use entrypoint only. thoughts?
utm_* documentation happening here https://github.com/mozilla/application-services/issues/135
@irrationalagent @davismtl are there clear next steps for this issue? Otherwise I think we should close it and open smaller/specific issues for any changes.
In #2505, we stopped setting utm_source=email on links in our outgoing emails because it clobbered the existing utm_source for the flow and polluted our metrics.
utm_medium is still set in exactly the same way though and probably suffers from a similar problem: https://github.com/mozilla/fxa-auth-server/blob/e47b7102e680fd2361b14100a059025046acd178/lib/senders/email.js#L1095
There is also conditional logic for setting utm_content and utm_campaign, which may or may not be fine.
Let's audit these to make sure they're not messing up any metrics.
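As a starting point for that audit, something like the following could list which utm_* parameters each email link sets. This is only a sketch: `utmParams` is a hypothetical helper, and the link paths and campaign values below are made-up examples, not the real templates.

```javascript
// Hypothetical audit helper: collect the utm_* query parameters set on a link.
function utmParams(link) {
  const found = {};
  for (const [key, value] of new URL(link).searchParams) {
    if (key.startsWith('utm_')) {
      found[key] = value;
    }
  }
  return found;
}

// Made-up example links; real ones would come from the rendered email templates.
const sampleLinks = [
  'https://accounts.firefox.com/verify_email?code=abc&utm_medium=email&utm_campaign=welcome',
  'https://accounts.firefox.com/settings?utm_medium=email&utm_content=manage-account',
];

for (const link of sampleLinks) {
  console.log(link, utmParams(link));
}
```

Any parameter that shows up here and is also set earlier in the flow is a candidate for the same clobbering problem that utm_source had.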
Issue is synchronized with this Jira Task. Issue Number: FXA-672