alphagov / govuk_frontend_toolkit

❗️GOV.UK Frontend Toolkit is deprecated, and will only receive major bug fixes and security patches.
MIT License

Avoid Personally Identifiable Information (PII) being sent to Google Analytics (GA) #435

h-lame closed 6 years ago

h-lame commented 6 years ago

For: https://trello.com/c/ilkmeS5E/239-investigate-preventing-pii-being-sent-to-ga

Our goal is to remove Personally Identifiable Information (PII) from any data we send to Google Analytics (GA). By default we strip email addresses (or things that look like them), and stripping postcodes (or things that look like them) can be turned on through configuration. Postcode stripping is off by default because too many things (airplane call signs, form names, etc.) look like postcodes but aren't, and would be stripped out as false positives. Because it is configurable, we can turn it on in the places where we know postcodes are used.
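
As a rough illustration of the behaviour described above, here is a minimal sketch. It is not the code in this pull request: the names `EMAIL_PATTERN`, `POSTCODE_PATTERN`, `stripPII` and the `stripPostcodes` option are hypothetical, and the regular expressions are simplified assumptions about what "looks like" an email address or a UK postcode.

```javascript
// Hypothetical sketch: these names and patterns are illustrative assumptions,
// not the toolkit's actual API.

// Anything shaped like local@domain, including a URL-encoded "@" (%40).
const EMAIL_PATTERN = /[^\s=/?&]+(?:@|%40)[^\s=/?&]+/g;

// A simplified UK postcode shape, allowing a space, "+" or "%20" in the middle.
const POSTCODE_PATTERN =
  /\b[A-PR-UWYZ][A-HJ-Z]?[0-9][0-9A-HJKMNPR-Y]?(?:[\s+]|%20)*[0-9][ABD-HJLNP-UW-Z]{2}\b/gi;

function stripPII(value, options = {}) {
  // Emails are always stripped; postcodes only when explicitly enabled,
  // because the postcode pattern produces too many false positives.
  let stripped = value.replace(EMAIL_PATTERN, "[email]");
  if (options.stripPostcodes === true) {
    stripped = stripped.replace(POSTCODE_PATTERN, "[postcode]");
  }
  return stripped;
}

console.log(stripPII("/search?q=jane@example.com"));
// prints "/search?q=[email]"
console.log(stripPII("/find?postcode=SW1A 2AA", { stripPostcodes: true }));
// prints "/find?postcode=[postcode]"
```

Keeping the postcode step behind an explicit option mirrors the trade-off above: the email pattern is safe enough to apply everywhere, while the postcode pattern is only applied where postcodes are expected.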

h-lame commented 6 years ago

Does anyone (@alphagov/gds-frontend-developers?) have any opinions about this, and about the implementation that makes use of it in static (https://github.com/alphagov/static/pull/1183)?

We need to do something to stop sending this kind of PII to GA so we can avoid falling foul of the General Data Protection Regulation (GDPR), and this is just one possible approach. Do we need to make it more configurable and entirely off by default, or approach it entirely differently, or ... ?

selfthinker commented 6 years ago

It would be great if you could avoid acronyms, or at least spell them out the first time you use them. I googled PII and GDPR and now know what they are, but I shouldn't have to do that. Even though I already knew what GA is, it would be good to spell that out too, so that as many people as possible can understand what this Pull Request is about.

For mere mortals: They want to remove all kinds of Personally Identifiable Information (PII) from Google Analytics (GA) so that they can comply with the General Data Protection Regulation (GDPR).

I would also adjust those acronyms in the changelog.

h-lame commented 6 years ago

Good point @selfthinker - I've been staring at this for so long I've forgotten that others might not have that context. I'll update the PR shortly!

h-lame commented 6 years ago

Rebased - thanks for reviewing @boffbowsh!