digital-analytics-program / gov-wide-code

Provides a set of JavaScript files and documentation to implement web analytics on US federal websites
http://www.digital.gov/dap

Pull in HTTPS version of GA code #11

Closed konklone closed 8 years ago

konklone commented 9 years ago

This line establishes the base GA snippet in the DAP and pulls in a "protocol-relative" URL for the main GA snippet:

})(window, document, 'script', '//www.google-analytics.com/analytics.js', 'ga');

It means that http:// pages will call an http:// URL, and https:// pages will call an https:// URL. This is the default URL that Google recommends in their traditional snippet. However, along with the changes we made in #8 to add the forceSSL option, I believe we should change it to always use https://:

})(window, document, 'script', 'https://www.google-analytics.com/analytics.js', 'ga');
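To illustrate the difference between the two forms, here is a minimal sketch that resolves each script reference against an insecure and a secure page using the standard URL API. The helper name `resolveScriptUrl` and the host `example.gov` are placeholders for illustration, not part of the DAP code.

```javascript
// Resolve a script reference against the page that includes it,
// using the standard WHATWG URL API (available in browsers and Node.js).
function resolveScriptUrl(scriptRef, pageUrl) {
  return new URL(scriptRef, pageUrl).href;
}

const protocolRelative = '//www.google-analytics.com/analytics.js';
const explicitHttps = 'https://www.google-analytics.com/analytics.js';

// example.gov is a placeholder host, used only for illustration.
const insecurePage = 'http://example.gov/';
const securePage = 'https://example.gov/';

// A protocol-relative reference inherits the scheme of the including page.
console.log(resolveScriptUrl(protocolRelative, insecurePage));
// http://www.google-analytics.com/analytics.js
console.log(resolveScriptUrl(protocolRelative, securePage));
// https://www.google-analytics.com/analytics.js

// An explicit https:// reference stays on HTTPS regardless of the page.
console.log(resolveScriptUrl(explicitHttps, insecurePage));
// https://www.google-analytics.com/analytics.js
```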

I believe that forceSSL ensures that subsequent requests after fetching analytics.js happen over HTTPS, even for insecure sites. I also believe that it's in those subsequent requests that the user's actual telemetry data gets sent up to GA. So the change proposed here would only protect the fetching of the GA code itself.
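For reference, this is roughly how the forceSSL field is set through the analytics.js command queue. It is a sketch only: the `ga` function below is a stand-in stub so the example runs on its own, the tracking ID is a placeholder, and the DAP code may set the field differently.

```javascript
// Stand-in for the analytics.js command queue so this sketch runs without
// loading the real library. On a real page, the GA snippet defines ga().
const queuedCommands = [];
function ga(...args) {
  queuedCommands.push(args);
}

ga('create', 'UA-XXXXX-Y', 'auto'); // placeholder tracking ID
ga('set', 'forceSSL', true);        // send hits over HTTPS even on http:// pages
ga('send', 'pageview');

console.log(queuedCommands);
```

Note that forceSSL only affects requests made after analytics.js has loaded; it has no effect on how analytics.js itself is fetched.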

So there's not a privacy leakage here when analytics.js is fetched from an insecure site (since the privacy leakage already occurred by the user simply visiting the insecure site). However, a) more HTTPS more better, but also b) the GA snippet is perhaps the single most requested URL on the entire internet, and so a prime target for an attacker to rewrite in transit. While an attacker could also rewrite the insecure page itself to fetch an insecure URL, this is an easy change that adds an extra bit of protection.

So: no reason to panic, but also no reason not to do it. We've already made this change on 18f.gsa.gov and some other partner sites.

konklone commented 9 years ago

FWIW, I just evaluated my theory. I updated the DAP code on analytics.usa.gov to use the latest version, 1.03, that has the forceSSL flag set to true.

I tested it from an http:// URL:

[Screenshot: dap-0]

And verified that the initial loading of GA code was over http://:

[Screenshot: dap-1]

And that the subsequent collection step was over https://, as desired by #8:

[Screenshot: dap-2]

I have also observed that other sites that recently implemented the DAP but did not use version 1.03 of the DAP code, such as whitehouse.gov, still perform the subsequent collection step over http://.

In any case - resolving this issue with my suggested fix would make both of the above steps function over https://, even for insecure websites.
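A check like the one I did by hand in the network panel could be sketched as follows. The function name and the sample URLs are hypothetical and only mirror the pattern described above (load step over http://, collection step over https://); they are not captured traffic.

```javascript
// Return the observed request URLs that reach Google Analytics over plain HTTP.
function findInsecureGaRequests(requestUrls) {
  return requestUrls.filter((raw) => {
    const url = new URL(raw);
    return url.protocol === 'http:' &&
      url.hostname === 'www.google-analytics.com';
  });
}

// Hypothetical list of requests made by an http:// page with forceSSL enabled.
const observed = [
  'http://analytics.usa.gov/',
  'http://www.google-analytics.com/analytics.js', // initial load step
  'https://www.google-analytics.com/collect?v=1', // collection step
];

console.log(findInsecureGaRequests(observed));
// [ 'http://www.google-analytics.com/analytics.js' ]
```

With the https:// fix applied to the snippet, the same check would be expected to return an empty list.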

konklone commented 9 years ago

This is a privacy and security issue that I hope the DAP will address in its next release.

konklone commented 9 years ago

A note that this is still an issue in 2.0. From local development:

[Screenshot from 2015-04-30 17:27:58]

Note that this is different from forceSSL, which is properly enabled, and forces an https:// connection for the collection step. The above is the initial connection to GA that downloads the JavaScript that then triggers the collection step.

However, a network that attacks or poisons google-analytics.com content still has an opening here. And it's a 6-character fix: https:.

konklone commented 8 years ago

Is there any update on when this six-character, all-upside fix will appear in the deployed DAP code?

tdlowden commented 8 years ago

Testing of the latest version of the code, which includes this fix, begins at the end of December. Once we have confirmed that everything runs well, the hosted code will be updated.

tdlowden commented 8 years ago

Fixed with v3.1!