mdn / kuma

The project that powers MDN.
https://developer.mozilla.org
Mozilla Public License 2.0

Drop in GA traffic #5986

Closed atopal closed 5 years ago

atopal commented 5 years ago

Summary: We are seeing a drop in GA traffic after the React front-end launch, which is inconsistent with other data sources (Search Console in this case).

Steps To Reproduce (STR)

  1. Go to GA
  2. Compare traffic before and after the launch

How can we ensure that we are still collecting as much telemetry as before?

peterbe commented 5 years ago

How much is the drop roughly? You mentioned in Slack that the numbers from organic search (Google Search Console) were steady. So perhaps that can be a baseline.

atopal commented 5 years ago

Okay, I have a bit more than one week of data now, and it looks like we are collecting 20 percentage points less metrics data than before. The block rate jumped from about 43% to 65%. That means about 65% of MDN traffic is now largely invisible to us.

atopal commented 5 years ago

Based on search traffic from Search Console, we should be seeing 20% y/y growth. Instead, we are seeing a 21% drop in users and a 55% drop in page views in Google Analytics.
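To put those two figures on the same scale, here is a rough worked calculation. It assumes (and this is only an assumption) that Search Console growth is a fair proxy for the growth GA should have shown; `shortfall` is a hypothetical helper, not something from the Kuma codebase.

```javascript
// Fraction of expected traffic that is missing, given an expected and an
// observed year-over-year change (both as fractions, e.g. 0.20 for +20%).
function shortfall(expectedChange, observedChange) {
  const expectedLevel = 1 + expectedChange;
  const observedLevel = 1 + observedChange;
  return 1 - observedLevel / expectedLevel;
}

// Figures from this comment: +20% expected, -21% users, -55% page views.
const usersGap = shortfall(0.20, -0.21);     // roughly 0.34
const pageviewsGap = shortfall(0.20, -0.55); // roughly 0.63

console.log(usersGap.toFixed(2), pageviewsGap.toFixed(2));
```

Under that assumption, about a third of the expected users and a bit under two thirds of the expected page views are missing from GA.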

peterbe commented 5 years ago

Here's the code on the new front-end:

    <script>
    // Mozilla DNT Helper
    /* This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. */ if(typeof Mozilla==='undefined'){var Mozilla={}}Mozilla.dntEnabled=function(dnt,ua){'use strict';var dntStatus=dnt||navigator.doNotTrack||window.doNotTrack||navigator.msDoNotTrack;var userAgent=ua||navigator.userAgent;var anomalousWinVersions=['Windows NT 6.1','Windows NT 6.2','Windows NT 6.3'];var fxMatch=userAgent.match(/Firefox\/(\d+)/);var ieRegEx=/MSIE|Trident/i;var isIE=ieRegEx.test(userAgent);var platform=userAgent.match(/Windows.+?(?=;)/g);if(isIE&&typeof Array.prototype.indexOf!=='function'){return false}else if(fxMatch&&parseInt(fxMatch[1],10)<32){dntStatus='Unspecified'}else if(isIE&&platform&&anomalousWinVersions.indexOf(platform.toString())!==-1){dntStatus='Unspecified'}else{dntStatus={'0':'Disabled','1':'Enabled'}[dntStatus]||'Unspecified'}return dntStatus==='Enabled'?true:false};
    // only load GA if DNT is not enabled
    if (Mozilla && !Mozilla.dntEnabled()) {
        (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
        m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
        })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

        ga('create', 'UA-36116321-5', 'mozilla.org');
        ga('set', 'anonymizeIp', true);

            // dimension1 == 'Signed-In"
            ga('set', 'dimension1', 'Yes');

            // dimension2 == "Beta Tester"

                ga('set', 'dimension2', 'Yes');

                // dimension18 == "Staff"
                ga('set', 'dimension18', 'Yes');

        // dimension9 == "Section editing"

            ga('set', 'dimension9', 'Enabled');

        (function() {
            // http://cfsimplicity.com/61/removing-analytics-clutter-from-campaign-urls
            var win = window;
            var removeUtms = function(){
                var location = win.location;
                if (location.href.indexOf('utm') != -1 && win.history.replaceState) {
                    win.history.replaceState({}, '', location.pathname);
                }
            };

            ga('send', 'pageview', {'hitCallback': removeUtms});
        })();
    }
    </script>

(From: view-source:https://developer.mozilla.org/en-US/)

The wiki:

    <script>
    // Mozilla DNT Helper
    /* This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. */ if(typeof Mozilla==='undefined'){var Mozilla={}}Mozilla.dntEnabled=function(dnt,ua){'use strict';var dntStatus=dnt||navigator.doNotTrack||window.doNotTrack||navigator.msDoNotTrack;var userAgent=ua||navigator.userAgent;var anomalousWinVersions=['Windows NT 6.1','Windows NT 6.2','Windows NT 6.3'];var fxMatch=userAgent.match(/Firefox\/(\d+)/);var ieRegEx=/MSIE|Trident/i;var isIE=ieRegEx.test(userAgent);var platform=userAgent.match(/Windows.+?(?=;)/g);if(isIE&&typeof Array.prototype.indexOf!=='function'){return false}else if(fxMatch&&parseInt(fxMatch[1],10)<32){dntStatus='Unspecified'}else if(isIE&&platform&&anomalousWinVersions.indexOf(platform.toString())!==-1){dntStatus='Unspecified'}else{dntStatus={'0':'Disabled','1':'Enabled'}[dntStatus]||'Unspecified'}return dntStatus==='Enabled'?true:false};
    // only load GA if DNT is not enabled
    if (Mozilla && !Mozilla.dntEnabled()) {
        (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
        m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
        })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

        ga('create', 'UA-36116321-5', 'mozilla.org');
        ga('set', 'anonymizeIp', true);

            // dimension1 == 'Signed-In"
            ga('set', 'dimension1', 'Yes');

            // dimension2 == "Beta Tester"

                ga('set', 'dimension2', 'Yes');

                // dimension18 == "Staff"
                ga('set', 'dimension18', 'Yes');

        // dimension9 == "Section editing"

            ga('set', 'dimension9', 'Enabled');

        (function() {
            // http://cfsimplicity.com/61/removing-analytics-clutter-from-campaign-urls
            var win = window;
            var removeUtms = function(){
                var location = win.location;
                if (location.href.indexOf('utm') != -1 && win.history.replaceState) {
                    win.history.replaceState({}, '', location.pathname);
                }
            };

            ga('send', 'pageview', {'hitCallback': removeUtms});
        })();
    }
    </script>

(From: view-source:https://wiki.developer.mozilla.org/en-US/)

No actual difference unless I fumbled when copy-n-pasting. Hmm...

peterbe commented 5 years ago

What's weird is this:

            // dimension1 == 'Signed-In"
            ga('set', 'dimension1', 'Yes');

That's from the new React front-end. You're never signed in! All HTML is supposed to be anonymous; the signed-in state only comes from /api/v1/whoami in a post-load XHR request.

Here's the view-source if you use curl:

    <script>
    // Mozilla DNT Helper
    /* This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/. */ if(typeof Mozilla==='undefined'){var Mozilla={}}Mozilla.dntEnabled=function(dnt,ua){'use strict';var dntStatus=dnt||navigator.doNotTrack||window.doNotTrack||navigator.msDoNotTrack;var userAgent=ua||navigator.userAgent;var anomalousWinVersions=['Windows NT 6.1','Windows NT 6.2','Windows NT 6.3'];var fxMatch=userAgent.match(/Firefox\/(\d+)/);var ieRegEx=/MSIE|Trident/i;var isIE=ieRegEx.test(userAgent);var platform=userAgent.match(/Windows.+?(?=;)/g);if(isIE&&typeof Array.prototype.indexOf!=='function'){return false}else if(fxMatch&&parseInt(fxMatch[1],10)<32){dntStatus='Unspecified'}else if(isIE&&platform&&anomalousWinVersions.indexOf(platform.toString())!==-1){dntStatus='Unspecified'}else{dntStatus={'0':'Disabled','1':'Enabled'}[dntStatus]||'Unspecified'}return dntStatus==='Enabled'?true:false};
    // only load GA if DNT is not enabled
    if (Mozilla && !Mozilla.dntEnabled()) {
        (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
        (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
        m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
        })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

        ga('create', 'UA-36116321-5', 'mozilla.org');
        ga('set', 'anonymizeIp', true);

        // dimension9 == "Section editing"

        (function() {
            // http://cfsimplicity.com/61/removing-analytics-clutter-from-campaign-urls
            var win = window;
            var removeUtms = function(){
                var location = win.location;
                if (location.href.indexOf('utm') != -1 && win.history.replaceState) {
                    win.history.replaceState({}, '', location.pathname);
                }
            };

            ga('send', 'pageview', {'hitCallback': removeUtms});
        })();
    }
    </script>

and that's definitely different!

peterbe commented 5 years ago

@escattone I think there's something strange going on with the non-wiki. I got different HTML when I used my browser compared to curl. Obviously developer.mozilla.org/api/v1/whoami should be different, but not the HTML.

Also, is CloudFront passing through cookies and stuff for non-API URLs?

peterbe commented 5 years ago

Perhaps the GA tracking works differently for anonymous vs. signed-in users. If a signed-in person visits MDN and their HTML gets cached in the CDN, the next person who hits that warmed-up CDN cache might get HTML as if they were logged in.
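A minimal sketch of that suspicion, assuming a CDN that forwards cookies to the origin but keys its cache on the URL alone. `renderPage` and `createCdn` are hypothetical stand-ins, not Kuma or CloudFront code.

```javascript
// Hypothetical origin: the GA snippet differs depending on the session cookie.
function renderPage(url, cookies) {
  const signedIn = Boolean(cookies.sessionid);
  return signedIn
    ? `<html data-url="${url}">ga('set', 'dimension1', 'Yes');</html>`
    : `<html data-url="${url}"></html>`;
}

// Hypothetical CDN: forwards cookies on a cache miss, but caches by URL only.
function createCdn(origin) {
  const cache = new Map();
  return function fetchPage(url, cookies = {}) {
    if (!cache.has(url)) {
      cache.set(url, origin(url, cookies));
    }
    return cache.get(url);
  };
}

const cdn = createCdn(renderPage);
const first = cdn('/en-US/', { sessionid: 'abc123' }); // signed-in visitor warms the cache
const second = cdn('/en-US/');                         // anonymous visitor, same URL
const leaked = second.includes('dimension1');

console.log(leaked); // the anonymous visitor received the signed-in variant
```

If this is what was happening, either the cookie must not reach the origin for cacheable pages, or the HTML must not depend on it.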

tobinmori commented 5 years ago

Set estimate to 1 for investigation. @escattone to add more issues for the fix.

escattone commented 5 years ago

@peterbe With the new CDN configuration that I rolled-out on stage/prod with https://github.com/mdn/infra/issues/291, the sessionid cookie is no longer forwarded for requests for document pages or the home page (it is of course still forwarded, as always, for /api/v1/whoami), so the HTML differences you saw in the GA script between logged-in and anonymous users will no longer occur (all users will look anonymous for initial page loads, as they should).

Here are my discoveries so far:

More to come.

peterbe commented 5 years ago

I "forked" the discussion to https://github.com/mdn/kuma/issues/6039

For the immediate future how about we focus on that ga('send', 'pageview' ... call?

escattone commented 5 years ago

Too many meetings today, but I had a bit of time to work on this. One thing that I'm wondering is if something changed with Firefox such that DNT is enabled by default now? That would prevent the ga function from being defined, which in turn, would cause the GAProvider React component to define the ga function as a "noop". Probably not, but it could be a factor.

Otherwise, I haven't seen anything yet that stands out as something that could explain such a huge drop in pageviews, but I'll keep checking.

atopal commented 5 years ago

I had checked earlier, but there was no Firefox release during that time. The drop is exactly on the day we switched to the React front-end.

peterbe commented 5 years ago

Most of our traffic is Chrome, and if the latest Chrome severely broke people's GA stats we'd hear an uproar. :)

peterbe commented 5 years ago

I believe https://github.com/mdn/kuma/pull/6047 fixes it so that the react home page and the react document pages share the same GA code which doesn't use request.user.

peterbe commented 5 years ago

The drop in users in GA is 1/3 but the drop in pageviews is 2/3. Not sure if that's a clue actually.

When we switched over to doing the ga('send', 'pageview', ...) in React, we started relying on making that send at the very end of the /api/v1/whoami XHR request: https://github.com/mdn/kuma/blob/b5557a17c8798dca12cc80e00d8484d24222b608/kuma/javascript/src/user-provider.jsx#L97

The /api/v1/whoami endpoint is notoriously slow, roughly 0.5 seconds. Not only that, but the XHR request isn't even started until after the React bundle has been downloaded, parsed, and executed, and the effects have fired. Add that overhead to the /api/v1/whoami slowness, and any visitor who is quick to close the tab or click away on something else won't count in GA. It's still curious how this would be the cause of firing almost 2/3 fewer pageviews. It is working, after all, in Chrome:

Screen Shot 2019-10-25 at 3 16 46 PM

And the script tag does get injected into the dom:

Screen Shot 2019-10-25 at 3 18 27 PM

Curiously, none of this works in Firefox Nightly (Tracking Protection disabled for developer.mozilla.org):

Screen Shot 2019-10-25 at 3 19 21 PM

(See, nothing beyond developer.mozilla.org, speedcurve and github avatars) Also, in Firefox:

Screen Shot 2019-10-25 at 3 22 16 PM
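The latency argument earlier in this comment can be sketched as a toy model. Only the roughly 0.5 second /api/v1/whoami figure comes from this thread; the bundle time and the dwell times below are made-up illustration values, and `isCounted` is a hypothetical helper.

```javascript
// A visit only reaches GA if the tab stays open until the pageview is sent,
// i.e. after the React bundle has run and /api/v1/whoami has returned.
function isCounted(dwellMs, { bundleMs, whoamiMs }) {
  return dwellMs >= bundleMs + whoamiMs;
}

const timings = { bundleMs: 800, whoamiMs: 500 }; // bundleMs is an assumption
const dwellTimes = [300, 900, 1200, 2000, 5000];  // hypothetical visitors (ms on page)

const counted = dwellTimes.filter((ms) => isCounted(ms, timings));
console.log(`${counted.length} of ${dwellTimes.length} visits counted`);
```

The model only loses visitors who leave within about a second, which is why this alone seems unlikely to account for a drop of almost 2/3.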
peterbe commented 5 years ago

There is an exception happening in the Browser Console which MIGHT be related to this domain:

Screen Shot 2019-10-25 at 3 26 26 PM

@escattone believes it's related to this: https://github.com/mdn/kuma/blob/b5557a17c8798dca12cc80e00d8484d24222b608/kuma/javascript/src/user-provider.jsx#L107 It's different from how the Wiki did it: https://github.com/mdn/kuma/blob/b5557a17c8798dca12cc80e00d8484d24222b608/jinja2/includes/google_analytics.html#L61

peterbe commented 5 years ago

Oh! The reason why it's not working in Firefox is because Mozilla && !Mozilla.dntEnabled() evaluates to false. Even though I have Tracking Protection switched OFF. It's supposed to respect that:

Screen Shot 2019-10-25 at 3 44 48 PM

peterbe commented 5 years ago

If I replace the if (Mozilla && !Mozilla.dntEnabled()) { with if (true) { it appears to work better:

Screen Shot 2019-10-25 at 3 48 58 PM

That "HERE" console logging is from inside the hitCallback.

peterbe commented 5 years ago

Ok. So ignore everything I've said about things not working in Firefox. I'm an odd unicorn who disabled Tracking Protection for wiki.developer.mozilla.org and developer.mozilla.org. I suspect that Mozilla && !Mozilla.dntEnabled() is false for every Firefox user with a modern browser. Actually, I tried with my wife's Firefox 69 and Mozilla && !Mozilla.dntEnabled() is false too, and that's the version before Tracking Protection was on by default (69). But perhaps I had enabled it manually on her laptop.

Either way, sadly, Firefox being weird or not wouldn't explain the massive drop, because most people are using Chrome.
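For reference, here is a readable rendering of the minified Mozilla DNT helper quoted earlier in this thread. The branching follows the minified source; the `typeof` guards are an addition so the sketch also runs outside a browser, and the user-agent strings in the usage lines are illustrative.

```javascript
// De-minified sketch of Mozilla.dntEnabled(dnt, ua). Both arguments are
// optional overrides; in a page they default to the browser's own values.
function dntEnabled(dnt, ua) {
  const nav = typeof navigator === 'undefined' ? {} : navigator; // guard added for non-browser runs
  const win = typeof window === 'undefined' ? {} : window;       // guard added for non-browser runs

  let dntStatus = dnt || nav.doNotTrack || win.doNotTrack || nav.msDoNotTrack;
  const userAgent = ua || nav.userAgent || '';

  const anomalousWinVersions = ['Windows NT 6.1', 'Windows NT 6.2', 'Windows NT 6.3'];
  const fxMatch = userAgent.match(/Firefox\/(\d+)/);
  const isIE = /MSIE|Trident/i.test(userAgent);
  const platform = userAgent.match(/Windows.+?(?=;)/g);

  if (isIE && typeof Array.prototype.indexOf !== 'function') {
    return false; // very old IE: treated as DNT off
  } else if (fxMatch && parseInt(fxMatch[1], 10) < 32) {
    dntStatus = 'Unspecified'; // Firefox older than 32: reported value is ignored
  } else if (isIE && platform && anomalousWinVersions.indexOf(platform.toString()) !== -1) {
    dntStatus = 'Unspecified'; // IE on the listed Windows versions: reported value is ignored
  } else {
    dntStatus = { '0': 'Disabled', '1': 'Enabled' }[dntStatus] || 'Unspecified';
  }
  return dntStatus === 'Enabled';
}

const fx70 = 'Mozilla/5.0 (X11; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0';
const fx31 = 'Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0';

console.log(dntEnabled('1', fx70)); // DNT reported as "1": GA is skipped
console.log(dntEnabled('0', fx70)); // DNT reported as "0": GA loads
console.log(dntEnabled('1', fx31)); // old Firefox: treated as unspecified, GA loads
```

So any browser that reports a Do Not Track value of "1" never defines the real ga function on these pages, regardless of how that value came to be set.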

escattone commented 5 years ago

After banging my head against this all day, I'm still completely baffled why we're seeing a 60% drop in pageview hits in GA (see below, we launched the React-based front-end late on Oct. 9th):

image

@peterbe and I have not been able to find anything yet that explains such a radical drop.

When I explored this issue locally, I used the following set-up:

.env file:

    DEBUG=True
    DOMAIN=mdn.localhost
    ENABLE_RESTRICTIONS_BY_HOST=True
    WIKI_HOST=wiki.mdn.localhost:8000
    ATTACHMENT_HOST=demos:8000
    SITE_URL=http://mdn.localhost:8000
    STATIC_URL=http://mdn.localhost:8000/static/
    ALLOW_ROBOTS_WEB_DOMAINS=mdn.localhost:8000

Added the following to the environment section of the worker service within my docker-compose.yml file (real GA account redacted):

    version: '2.1'
    services:
      worker: &worker
        ...
        environment:
          ...
          - GOOGLE_ANALYTICS_ACCOUNT=UA-xxxxxxxx-y

Modified the cookieDomain setting of ga('create', ...) within jinja2/includes/react_google_analytics.html to 'auto':

    ga('create', '{{ settings.GOOGLE_ANALYTICS_ACCOUNT }}', 'auto');

That will give you a debug version of GA (analytics_debug.js) that will log all of the calls it makes. For example, within Firefox (latest version with Standard selected for my Browser Privacy setting within Privacy & Security) I see this within the dev-tools Console tab:

image

and this within Chrome:

image

tobinmori commented 5 years ago

This is great. Thanks for digging into this @escattone and for your help too @peterbe.

@atopal do you want us to continue digging into this? (It will, of course, push other work out, since we originally allotted a day for this.)

atopal commented 5 years ago

@tobinmori yes, I don't see an alternative. We're mostly flying blind at this point.

atopal commented 5 years ago

Maybe it's helpful to know that Chrome user traffic dropped by 27% and Firefox traffic by 48% during the same time period

atopal commented 5 years ago

Safari and Edge roughly 35%. Again, no idea if this is relevant.

atopal commented 5 years ago

No geographic differences (except China, but that's probably due to Golden week)

escattone commented 5 years ago

I just wanted to document what @peterbe and I just discovered and suspect is the root cause of this issue. What we've discovered is that there's a race condition in the front-end. The race is between the definition of the ga function that gets started here and the hydration of the GAProvider here where the ga function is "locked-in" as the provided value. If the ga function has been defined by the time the hydration starts, we're fine. If not, the GAProvider will lock-in a "noop" function and provide that as the ga function to all of its child components.

joedarc commented 5 years ago

@escattone could it be because the GA <script> tag has [async]?

joedarc commented 5 years ago

@escattone to clarify, here it sets a.async=1. Wondering if removing that would resolve the race condition.

peterbe commented 5 years ago

@joedarc No, it's best to leave that one async. The flaw was that we were doing this:

  1. Inject <script src="https://www.google-analytics.com/analytics.js" async>
  2. Create GA context provider = has analytics loaded yet ? use it : noop.
  3. looong time passes whilst we're waiting for /api/v1/whoami
  4. Actually use the GA context provider.

What it needs to be is this:

  1. Inject <script src="https://www.google-analytics.com/analytics.js" async>
  2. looong time passes whilst we're waiting for /api/v1/whoami
  3. Create GA context provider = has analytics loaded yet ? use it : noop.
  4. Actually use the GA context provider.

joedarc commented 5 years ago

@peterbe Understood, makes sense. 👍

escattone commented 5 years ago

@joedarc @peterbe Hm, I'm starting to doubt this theory. The ga function is always defined as soon as this code runs, not after analytics.js is fully loaded. The ga function is designed to queue the requests until analytics.js is fully loaded.
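A small sketch of that queueing behaviour, using the same stub shape as the GA snippet quoted earlier in this thread. `win` is a plain object standing in for `window` so it runs outside a page, and the account id is a placeholder.

```javascript
// Stand-in for the browser's window object.
const win = {};

// Same shape as the stub in the inline GA snippet: until analytics.js takes
// over, calling ga() only appends its arguments to ga.q.
win.ga = win.ga || function () {
  (win.ga.q = win.ga.q || []).push(arguments);
};
win.ga.l = 1 * new Date();

win.ga('create', 'UA-XXXXXXXX-Y', 'auto'); // placeholder account id
win.ga('send', 'pageview');

// Each queue entry is an arguments object; convert to arrays for inspection.
const queued = win.ga.q.map((args) => Array.from(args));
console.log(queued);
```

The function exists as soon as the inline script has run, so a check like `typeof window.ga === 'function'` passes long before analytics.js has loaded.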

escattone commented 5 years ago

Just documenting something I'm seeing.

This is the ga function when it works:

image

and this is the ga function when it doesn't:

image

escattone commented 5 years ago

@joedarc @peterbe Yeah, that theory is not quite right, since I always see this line run both when it fails and when it doesn't, but this issue still seems to be due to a race condition of some sort.

escattone commented 5 years ago

@joedarc @peterbe Here's my latest theory. It's a race condition between the GAProvider hydration and the completion of the load of analytics.js. When it fails, the GAProvider locks-in the ga function with the initial queueing version (i.e. function(){(i[r].q=i[r].q||[]).push(arguments)}) so that even after analytics.js loads and re-defines the ga function (to function(a){dd("Executing Google Analytics commands.");F(1);jf.H.apply(jf,[arguments]);ge()}), the GAProvider keeps on providing the old queuing version.
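This theory can be reproduced in a few lines, under stated assumptions: `win` stands in for `window`, and `loadAnalytics` is a hypothetical stand-in that drains the queue and swaps in a new ga function. It is not how analytics.js is actually implemented, only the observable effect described above.

```javascript
const win = {};
const sent = []; // hits that actually "reach GA" in this simulation

// 1. The inline snippet defines the queueing stub.
win.ga = function () {
  (win.ga.q = win.ga.q || []).push(arguments);
};

// 2a. Eager capture: the provider stores whatever ga is at hydration time.
const eagerGa = win.ga;
// 2b. Lazy lookup: the provider resolves win.ga again on every call.
const lazyGa = (...args) => {
  if (typeof win.ga === 'function') {
    win.ga(...args);
  }
};

// 3. Hypothetical stand-in for analytics.js finishing its load: it processes
//    the queue once and then replaces ga with the real implementation.
function loadAnalytics() {
  const pending = (win.ga.q || []).map((args) => Array.from(args));
  pending.forEach((args) => sent.push(args));
  win.ga = (...args) => {
    sent.push(args);
  };
}
loadAnalytics();

// 4. Much later (after /api/v1/whoami), the pageview is finally sent.
eagerGa('send', 'pageview', '/eager'); // goes into a queue nobody reads any more
lazyGa('send', 'pageview', '/lazy');   // reaches the real ga function

console.log(sent.map((args) => args[2]));
```

In this model only the lazy call is recorded, which matches the failure mode where the provider keeps handing out the old queueing version.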

Here's an example of what I'm seeing consistently on failure:

image

When it works, I see that the GAProvider locks-in the final ga function, not the initial one:

image

escattone commented 5 years ago

The following "lazy" providence of the ga function via the GAProvider seems to fix it (I can't get it to fail any longer locally):

    export default function GAProvider(props: {
        children: React.Node
    }): React.Node {
        let ga: GAFunction;

        ga = function(...args) {
            // If there is a window object that defines a ga() function, then
            // that ga function is the value we will provide. Otherwise we just
            // provide a dummy function that does nothing.
            if (typeof window === 'object' && typeof window.ga === 'function') {
                window.ga(...args);
            }
        };
        return <context.Provider value={ga}>{props.children}</context.Provider>;
    }

joedarc commented 5 years ago

FWIW, I used the above code from @escattone and was also unable to get the error to occur locally, even after refreshing countless times.

peterbe commented 5 years ago

> The following "lazy" providence of the ga function via the GAProvider seems to fix it (I can't get it to fail any longer locally):

That's what I have in my branch too. More or less exactly that. But I'm struggling with jest tests at the moment.