gohugoio / hugo

The world’s fastest framework for building websites.
https://gohugo.io
Apache License 2.0

Improve Hugo Internal vs GDPR #4616

Closed onedrawingperday closed 6 years ago

onedrawingperday commented 6 years ago

Following the discussion at the forum: https://discourse.gohugo.io/t/hugo-vs-the-general-data-protection-regulations-gdpr-in-eu-eea/11526/12

I will focus on the Google Analytics internal template in this first post. But feel free to add suggestions about other templates such as Disqus, Twitter etc

Google Analytics

Notes

  1. The following suggestions are for Analytics.js that is used in the current internal template

  2. For Google Tag Manager things are more complicated and the user needs to set various settings directly in the Google Analytics dashboard. I'm not using Google Tag Manager myself and I cannot help with it.

  3. I am not an expert; these suggestions come from my own research. Feel free to build on them to help @bep build internal templates that make Hugo offer GDPR compliant websites out of the box.


The general idea I am proposing is to make the Hugo GA internal template not collect any personally identifiable information so that it falls outside the scope of the GDPR. With such settings a Hugo site admin will not need user opt-in for Google Analytics cookies because no such cookies will be installed on a user's device. However as you may have guessed this will severely limit GA reporting. Namely no returning visitors etc. Also the owner of a GA property should have the User ID, Google Data Sharing and Advertising features disabled on their dashboard.

Note Enabling any of the above features or using the default Google Analytics code as is in the Hugo internal template requires user opt-in on the frontend. The tricky part is that you will need to have Google Analytics disabled until the user agrees to it. I am not going to cover this scenario here.
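
For the opt-in scenario, here is a minimal sketch of keeping analytics.js switched off until the visitor agrees. It assumes the documented `window['ga-disable-UA-XXXXX-Y']` opt-out property of analytics.js; `setAnalyticsConsent` and its arguments are hypothetical names, not part of Hugo or Google Analytics, and the flag has to be set before `ga('create', ...)` runs.

```javascript
// Hypothetical helper: toggles the analytics.js opt-out flag for one property.
// `win` is the global object (window in a browser); passing it in keeps the
// function usable outside a browser as well.
function setAnalyticsConsent(win, trackingId, consented) {
  // analytics.js sends no hits while window['ga-disable-<ID>'] is true.
  win['ga-disable-' + trackingId] = !consented;
  return win['ga-disable-' + trackingId];
}

// Usage in a page, before the tracking snippet (the consent value itself
// would come from a banner or similar, which is not shown here):
// setAnalyticsConsent(window, 'UA-XXXXX-Y', false);
```

Tracking stays disabled by default this way, and calling the helper again with `true` re-enables it for later hits.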

The key settings that need to change in the internal template are:

Anonymize IP

The GDPR treats IPs as personally identifiable information, so the following should be enabled in the tracking code:

ga('set', 'anonymizeIp', true);

(reference)

Disable Cookies and Use Session Storage to Store the Client ID

Session Storage is for the duration of a user's visit on a website. Even the Client ID is treated as Personally Identifiable Information in some quarters. There is no consensus about this currently. See a blog post about this here.

var GA_SESSION_STORAGE_KEY = 'ga:clientId';
if (window.sessionStorage) {
  ga('create', 'UA-XXXXX-Y', {
    'storage': 'none',
    'clientId': sessionStorage.getItem(GA_SESSION_STORAGE_KEY)
  });
  ga(function(tracker) {
    sessionStorage.setItem(GA_SESSION_STORAGE_KEY, tracker.get('clientId'));
  });
}
else {
  ga('create', 'UA-XXXXX-Y', 'auto');
}
ga('send', 'pageview');

The above is a modified version of Google's sample code for using Local Storage to store the Client ID.

However Local Storage is persistent until cleared by the user and as such a Hugo site admin would still need user opt-in as Local Storage is treated the same as cookies.

There are caveats with the above setup. Below is the disclaimer by Google from the page I linked to above:

Note: unlike cookies, localStorage is bound by the same-origin policy. If parts of your site are on different subdomains, or if some pages use http and others pages use https, you cannot use localStorage to track users between those pages. For this reason, cookies continues to be the officially recommended way to store the Client ID.

Basically what the GDPR does is to limit data collection to the bare minimum. A setup like the one I propose above would make the Hugo GA internal template GDPR compliant but at the expense of GA reporting.

It's up to the community to decide how to go about this.

onedrawingperday commented 6 years ago

YouTube

Google is offering video embedding under the domain youtube-nocookie.com so that cookies are not set on a user's device before he/she presses the play button.

Hugo's current internal YouTube template needs to change to the following:

t.addInternalShortcode("youtube.html", `{{ if .IsNamedParams }}
<div {{ if .Get "class" }}class="{{ .Get "class" }}"{{ else }}style="position: relative; padding-bottom: 56.25%; padding-top: 30px; height: 0; overflow: hidden;"{{ end }}>
  <iframe src="//www.youtube-nocookie.com/embed/{{ .Get "id" }}?{{ with .Get "autoplay" }}{{ if eq . "true" }}autoplay=1{{ end }}{{ end }}"
  {{ if not (.Get "class") }}style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" {{ end }}allowfullscreen frameborder="0" title="YouTube Video"></iframe>
</div>{{ else }}
<div {{ if len .Params | eq 2 }}class="{{ .Get 1 }}"{{ else }}style="position: relative; padding-bottom: 56.25%; padding-top: 30px; height: 0; overflow: hidden;"{{ end }}>
  <iframe src="//www.youtube-nocookie.com/embed/{{ .Get 0 }}" {{ if len .Params | eq 1 }}style="position: absolute; top: 0; left: 0; width: 100%; height: 100%;" {{ end }}allowfullscreen frameborder="0" title="YouTube Video"></iframe>
 </div>
{{ end }}`)

Basically the change is from www.youtube.com to www.youtube-nocookie.com
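
To make the host switch explicit, here is a small sketch of the URL logic the shortcode implements. `youtubeEmbedUrl` and its `noCookie` / `autoplay` options are hypothetical names used for illustration only; the template above stays the actual proposal.

```javascript
// Hypothetical helper mirroring the shortcode's URL logic.
// The privacy-enhanced host is used unless noCookie is explicitly false.
function youtubeEmbedUrl(id, options) {
  var opts = options || {};
  var host = opts.noCookie === false ? 'www.youtube.com' : 'www.youtube-nocookie.com';
  var url = 'https://' + host + '/embed/' + encodeURIComponent(id);
  if (opts.autoplay) {
    url += '?autoplay=1';
  }
  return url;
}
```

Defaulting to the youtube-nocookie.com host means a caller has to opt out of the privacy-enhanced embed rather than opt in.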

onedrawingperday commented 6 years ago

A clarification about the youtube-nocookie embed above.

YouTube has offered this "privacy enhanced" option since 2009 and it will not set cookies until a user clicks to play a video, or so it claims.

I discovered that there’s been some controversy about it in the past.

Unless Google begins offering a better privacy embed solution it’s all we've got.

onedrawingperday commented 6 years ago

Modified Google Analytics snippet. Version 2.

The snippet I posted above checks whether Session Storage is available in the client's browser and if it's not it creates the Google Analytics cookies the old way.

That's not allowed under the GDPR without opt-in. So I have modified the snippet to use Session Storage only. Of course this will only work on browsers that support Session Storage. See: https://caniuse.com/#search=sessionstorage

But then again I don't think Hugo should support non modern browsers.

Here is the final GDPR compliant snippet tested and working.

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');

var GA_SESSION_STORAGE_KEY = 'ga:clientId';

ga('create', 'UA-XXXXX-Y', {
  'storage': 'none',
  'clientId': sessionStorage.getItem(GA_SESSION_STORAGE_KEY)
});
ga(function(tracker) {
  sessionStorage.setItem(GA_SESSION_STORAGE_KEY, tracker.get('clientId'));
});
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');

Of course Google might come up with a better GDPR compliant offering by the 25th of May but this is the best solution I've come up with and it's compliant because it doesn't set persistent cookies or Local Storage on a visitor's device.
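
One gap in the snippet is that it assumes `sessionStorage` is always readable. Below is a hedged sketch of building the `ga('create', ...)` fields defensively, without ever falling back to cookies. `clientIdFields` is a hypothetical helper name, and the behaviour when no client ID is supplied (analytics.js generating a fresh one) is my understanding rather than something tested in this thread.

```javascript
var GA_SESSION_STORAGE_KEY = 'ga:clientId';

// Hypothetical helper: returns the fields object for ga('create', id, fields).
// `storage` should behave like window.sessionStorage; pass null when it is
// unavailable.
function clientIdFields(storage) {
  var fields = { storage: 'none' }; // never let analytics.js set cookies
  try {
    var saved = storage ? storage.getItem(GA_SESSION_STORAGE_KEY) : null;
    if (saved) {
      fields.clientId = saved;
    }
  } catch (e) {
    // Storage access can throw (for example when blocked by browser
    // settings); the client ID is then simply left unset.
  }
  return fields;
}

// Usage in a page:
// var storage = null;
// try { storage = window.sessionStorage; } catch (e) {}
// ga('create', 'UA-XXXXX-Y', clientIdFields(storage));
```

The trade-off is the same as before, only stricter: without readable Session Storage every page view looks like a new visitor.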

it-gro commented 6 years ago

This is about Disqus.com and the internal disqus template hugo/tpl/tplimpl/template_embedded.go

I'm not a lawyer, but browsing to https://disqus.com/admin/ I see user comments (of the site I'm the admin of) including the user's nickname, the email address the user used for registration and the IP address. The question is whether this is GDPR relevant, as it is processing of personal data. https://help.disqus.com/moderation/moderating-101

Anyway - here's a proposal - Please comment.

{{ if .Site.DisqusShortname }}<div id="disqus_thread"></div>
<script>
    var disqus_config = function () {
    {{with .GetParam "disqus_identifier" }}this.page.identifier = '{{ . }}';{{end}}
    {{with .GetParam "disqus_title" }}this.page.title = '{{ . }}';{{end}}
    {{with .GetParam "disqus_url" }}this.page.url = '{{ . | html  }}';{{end}}
    };
    function disqusAgree(){
      localStorage.setItem("agreed_to_disqus_thread", "YES");
      localStorage.setItem("agreed_to_disqus_thread_date", (new Date()).toLocaleString() );
      location.reload();
    };
    (function() {
        if (["localhost", "127.0.0.1"].indexOf(window.location.hostname) != -1) {
            document.getElementById('disqus_thread').innerHTML = 'Disqus comments not available by default when the website is previewed locally.';
            return;
        }
      {{- if ne ($.Param "disqusSkipAgree") true }}
        if ((localStorage.getItem("agreed_to_disqus_thread") != "YES") ) {
          document.getElementById('disqus_thread').innerHTML = '{{ (default `Show comments powered by [disqus.com](https://disqus.com)` (i18n `disqusTxtAgree`) ) | markdownify }} <button id="agree-to-disqus" type="button" onclick="disqusAgree()">{{default `Show me` (i18n `disqusBtnAgree`)}}</button>';
          return;
        }
      {{- end }}
        var d = document, s = d.createElement('script'); s.async = true;
        s.src = '//' + {{ .Site.DisqusShortname }} + '.disqus.com/embed.js';
        s.setAttribute('data-timestamp', +new Date());
        (d.head || d.body).appendChild(s);
    })();
</script>
<noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript>
{{end}}

The parts I added:

      {{- if ne ($.Param "disqusSkipAgree") true }}
        if ((localStorage.getItem("agreed_to_disqus_thread") != "YES") ) {
          document.getElementById('disqus_thread').innerHTML = '{{ (default `Show comments powered by [disqus.com](https://disqus.com)` (i18n `disqusTxtAgree`) ) | markdownify }} <button id="agree-to-disqus" type="button" onclick="disqusAgree()">{{default `Show me` (i18n `disqusBtnAgree`)}}</button>';
          return;
        }
      {{- end }}

By setting

[params]
  disqusSkipAgree = true

The new behaviour is turned off.

If it is not set, or set to anything other than true, a visitor first has to press a button in order to see the comments. The browser's local storage is used to remember their "agreement".

I think the text should be changeable. => Text besides the button:

(default `Show comments powered by [disqus.com](https://disqus.com)` (i18n `disqusTxtAgree`) ) | markdownify

The button itself

default `Show me` (i18n `disqusBtnAgree`)

This way (if i18n is accessible inside an internal template?) users can modify the text:

i18n/en.yaml

- id: "disqusTxtAgree"
  translation: "Show comments powered by [disqus.com](https://disqus.com). Agree to the [Terms](/terms)"

- id: "disqusBtnAgree"
  translation: "Yes, I Agree"

(/terms would be the reference to the site terms, needed by GDPR)

    function disqusAgree(){
      localStorage.setItem("agreed_to_disqus_thread", "YES");
      localStorage.setItem("agreed_to_disqus_thread_date", (new Date()).toLocaleString() );
      location.reload();
    };

This is the function called by clicking the button. It stores a YES and a readable timestamp. Once pressed, the button will not show again (until the local storage is cleared).

What do you think?

Edit: See here an example

onedrawingperday commented 6 years ago

@it-gro The tricky part in seeking user consent as a lawful basis under the GDPR is that we have to avoid making consent to processing a precondition of a service. See this guidance from the UK Information Commissioner's Office (ICO): https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/lawful-basis-for-processing/consent/

I don't know if it's technically possible but I sincerely think that it would be best to show the Disqus Comments as plain text under each article and then move the Consent you developed at the bottom if a user wants to comment using Disqus.

brunoamaral commented 6 years ago

@onedrawingperday I'm using tag manager on a number of sites. What information have you gathered and how can I help?

onedrawingperday commented 6 years ago

@brunoamaral I don’t use Google Tag Manager for Google Analytics but see this post for all the tweaks that someone suggests: https://www.humix.be/en/blog/configure-google-analytics-for-gdpr/

Under the GDPR whenever an identifier is stored in a user’s device in the form of persistent cookies or local storage or the recording of an IP address etc a website is required by the new regulation to offer user opt-in. This means that tracking needs to be disabled by default and enabled only if a user agrees to it.

In my above proposal I have tried to make Google Analytics fall outside the scope of the GDPR by not storing anything persistent on a user’s device and moving the GA client ID to session storage.

If you can come up with something similar for Google Tag Manager (that is now the default Analytics tracking code), YouTube or any other of Hugo’s internal templates for third party services that would help a lot.

brunoamaral commented 6 years ago

Alright, then just to sum up what I know and share with others:

The default install of Google Tag Manager (GTM) will only fire events to Google Analytics or one of its other tracking apps. It does not store information, and if the Google Analytics implementation is done in the way you describe above, the admin of the site can rest assured.

The problem comes up when you use Google Tag Manager itself to fire the Google Analytics tag, in this way: image

That {{Universal Analytics ID}} is a variable that sets options for the GA script. Here is the implementation I use at brunoamaral.eu:

image

I'm using the same anonymizeIp option you suggested, but so far have not found an equivalent to the other options you set.

In short, if the admin is using GTM as a one-stop-shop for events and setup of the Universal Analytics code, compliance with the GDPR falls on their shoulders. They need to configure it in the GTM workspace. The default code for GTM does not break the GDPR, as far as I know.

I will keep looking into this and report back if I find anything relevant.

ghost commented 6 years ago

This is really useful. My partner is going to a conference on the new directives this week.

If there is a hugo sample template repo being setup to show how Hugo should be done I will be happy to contribute all the aspects we learn and code that is relevant.

onedrawingperday commented 6 years ago

@gedw99 You can view the source of Hugo's current internal templates for third party services & other things at: https://github.com/gohugoio/hugo/blob/master/tpl/tplimpl/template_embedded.go

ghost commented 6 years ago

@onedrawingperday I never knew Hugo has a default template that it uses. So if you don't give it any theme to use, it uses this?

Won't we break everyone if we change this though?

brunoamaral commented 6 years ago

@gedw99 These internal templates need to be called explicitly in the theme in order to be used.

@onedrawingperday I have been doing some research and am wondering about the approach of using LocalStorage. It's still identifiable information, and may still be in the scope of GDPR. Question is if we should wait for the user to agree to cookies before activating google analytics or not.

onedrawingperday commented 6 years ago

@gedw99 These are the Hugo internal templates for Disqus, Google Analytics, Open Graph, Schema and Twitter Cards. See: https://gohugo.io/templates/internal/#the-internal-templates

There are also internal shortcodes for YouTube, Vimeo, Instagram, Twitter and Speaker Deck. See: https://gohugo.io/content-management/shortcodes/#use-hugo-s-built-in-shortcodes

Won't we break everyone if we change this though?

@bep asked me to open this issue to make the Hugo internal templates GDPR compliant. When the regulation is enforced come May the 25th, Hugo site admins will have to seek user consent as legal basis for the data collection of the 3rd party services that are enabled by these internal Hugo templates. Otherwise they will be breaking the new regulation and might face heavy fines. However this does not apply only to EU and EEA based websites but also to every web platform that has a European audience.

It's up to the Hugo community to decide what they want to do: make Hugo GDPR compliant and break some eggs, or leave things as they stand now and leave Hugo users to fend for themselves.

onedrawingperday commented 6 years ago

@brunoamaral Local Storage is persistent; Session Storage is not. In the GA snippet I posted above I have moved everything into Session Storage, so that once the browser tab is shut there will be no GA identifier on a user's device. Also the IP address of a visitor will be masked and no personal information is transmitted.

I also think that it would be prudent to have a notice asking users to agree to a website's privacy policy. See the new GDPR notice of https://www.nike.com

Maybe Google will start offering a GDPR compliant tracking code for Google Analytics (but I very much doubt it).

bep commented 6 years ago

Just want to chime in and say that I have not forgotten about this, just have been ... kind of busy on other fronts.

it-gro commented 6 years ago

@onedrawingperday "avoid making consent to processing a precondition of a service": it is my understanding that the "service" is the static page built using Hugo. Not agreeing to the Hugo site's "Disqus terms" does not stop this service (showing pages and posts on the site) for the visitor. But it affects the ability to see or make any comments (using disqus.com), which is not a service of the Hugo site (but of disqus.com). A GDPR compliant agreement for the site is needed since the site admin can see / process visitors' personal data on the disqus.com admin site (see discourse.gohugo.io).

onedrawingperday commented 6 years ago

@it-gro Of course the GDPR is open to interpretation at this stage. It's up to you to decide what you want to do.

In my humble opinion comments are an essential part of a blog. Users shouldn't have to accept Disqus tracking to view them.

You will need to inform them with a notice. So that if they do not wish to have the Disqus tracking on their devices comments are disabled.

onedrawingperday commented 6 years ago

Ah! I accidentally closed the issue and reopened... pressed the wrong button on the phone. Apologies.

onedrawingperday commented 6 years ago

@jhabdas Thanks for the heads up. I've made a few tests but I cannot see this script being loaded on my end with the youtube-nocookie iframe.

Can you please tell me what you see on your end? Where is the script loaded? In the parent DOM or within the iframe? (checked both and the console and I couldn't find it). Also does it load before or after video playback?

Thanks

TotallyInformation commented 6 years ago

I have received an update from Google regarding GDPR compliance and the use of GA. I can post here if required.

They have added granular data retention controls and user deletion. They have also updated their user consent policy.

it-gro commented 6 years ago

@onedrawingperday

You will need to inform them with a notice. So that if they do not wish to have the Disqus tracking on their devices comments are disabled.

Which may be done via this:

i18n/en.yaml

- id: "disqusTxtAgree"
  translation: "Show comments powered by [disqus.com](https://disqus.com). Agree to the [Terms](/terms)"

(see Example: https://it-gro.github.io/hugo-theme-w3css-basic.github.io/blog/2017/11/10/hugo-dolor/ )

ghost commented 6 years ago

Nice. Could you post the message from google. Curious to learn for other project.


onedrawingperday commented 6 years ago

Instagram

Summary

The default Instagram oEmbed endpoint, as used in the current Hugo internal shortcode, injects persistent cookies and local storage (that cannot be blocked) through the Instagram embeds.js JavaScript library.

To make the shortcode GDPR compliant -and by that I mean disable Instagram's user tracking- I propose to re-construct the HTML structure of the Instagram embed through the shortcode.

There are a number of caveats with this approach:

  1. The oEmbed endpoint JSON as seen for example here does not contain user data keys. That means that the hover card with user posts, number of followers, the photo likes and the user avatar are simply out of reach.

  2. The Instagram API is going to be deprecated completely by 2020 and will be replaced by the Instagram (Facebook) Graph API that currently is open only for business. I am not currently aware whether they're going to eliminate the oEmbed endpoint.


Proposal

The input of the internal Instagram shortcode will be kept as is:

{{< instagram BWNjjyYFxVx hidecaption >}}

But everything within it will be completely different: See Gist here.

Basically I am ranging through the JSON keys that are available through the Instagram oEmbed endpoint.

To get the user avatar I am proposing the following: {{ $.Page.Params.inst_avatar | default $.Site.Params.inst_avatar }}. This means that users who wish to use the internal shortcode will need to declare the address of the user avatar either in their site's config or in the front matter of a page. (Obviously this is for embeds from one Instagram user per page.)

To get the Instagram embed styling I have modified their external stylesheet because it is meant for an iframe and it contains generic CSS rules that will conflict with any Hugo theme out there. See the Gist with the modified stylesheet.

I am proposing to load this custom stylesheet conditionally in the <head> tag like so: {{ with .HasShortcode "instagram" }}<link rel="stylesheet" href="/inst.css" />{{ end }}

Note that within the CSS I am calling the Instagram controls sprite that lives in this address as sprite.png and I am proposing to serve it from /static/.

I have eliminated the likes count from the proposed internal shortcode because as I said above there is no way to get them through the oEmbed endpoint.

When constructing the Instagram embed URL I have appended the following: "/&amp;maxwidth=640&amp;omitscript=true" because 640 pixels is the maximum thumbnail width that is available and also I want to omit Instagram's Javascript Libraries.
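
The URL construction described above can be sketched as follows. The base address `https://api.instagram.com/oembed/` and the `instagram.com/p/<id>/` post URL form are assumptions about the legacy oEmbed endpoint, and `instagramOembedUrl` is a hypothetical name; the real shortcode builds this in a Hugo template, not in JavaScript.

```javascript
// Hypothetical helper: builds the oEmbed request URL for one post,
// capping the thumbnail at 640px and omitting Instagram's embeds.js.
function instagramOembedUrl(postId) {
  var postUrl = 'https://instagram.com/p/' + encodeURIComponent(postId) + '/';
  return 'https://api.instagram.com/oembed/?url=' + postUrl +
    '&maxwidth=640&omitscript=true';
}
```

With `omitscript=true` the returned HTML no longer pulls in the tracking script, which is the point of the whole exercise.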

To extract the Instagram photo timestamp from Instagram's html JSON key I had to resort to some RegEx. See the Gist here.
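
As an illustration of the kind of RegEx involved (this is not the Gist's code, which uses Hugo template functions), the sketch below pulls a `datetime` attribute out of an HTML string. It assumes the `html` key contains a `<time>` element with a `datetime="..."` attribute; `extractDatetime` is a hypothetical name.

```javascript
// Hypothetical helper: returns the value of the first datetime="..."
// attribute found in an HTML string, or null when there is none.
function extractDatetime(html) {
  var match = /datetime="([^"]+)"/.exec(html);
  return match ? match[1] : null;
}

// Example with made-up markup:
// extractDatetime('<time datetime="2017-07-06T18:46:25+00:00">Jul 6</time>');
```

Matching on the attribute alone keeps the expression short, at the cost of breaking if Instagram changes the markup.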

In order to render the time difference since an Instagram photo post was created -in the form of the default Instagram embed- I am proposing the use of timeago.js a tiny 2kb library.

Within the shortcode I've created the following structure: <span class="ago" datetime="{{ $date }}"></span>

And then in the <footer> tag of a page I am conditionally loading timeago.js and rendering the timestamp like this:

{{ with .HasShortcode "instagram" }}<script src="/timeago.min.js" type="text/javascript"></script><script>timeago().render(document.querySelectorAll('.ago'));</script>{{ end }}
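
For those who would rather not ship an extra library, a dependency-free sketch of the same idea follows. It is not timeago.js and does not match its wording or localisation; `timeSince` is a hypothetical function, and months and years are approximated as 30 and 365 days.

```javascript
// Hypothetical helper: formats the distance between two Date objects
// as "N unit(s) ago", using the largest unit that fits.
function timeSince(then, now) {
  var seconds = Math.max(0, Math.floor((now.getTime() - then.getTime()) / 1000));
  var units = [
    ['year', 31536000],  // 365 days
    ['month', 2592000],  // 30 days
    ['week', 604800],
    ['day', 86400],
    ['hour', 3600],
    ['minute', 60]
  ];
  for (var i = 0; i < units.length; i++) {
    var count = Math.floor(seconds / units[i][1]);
    if (count >= 1) {
      return count + ' ' + units[i][0] + (count === 1 ? '' : 's') + ' ago';
    }
  }
  return 'just now'; // less than a minute
}

// Usage in a page, filling every .ago element from its datetime attribute:
// document.querySelectorAll('.ago').forEach(function (el) {
//   el.textContent = timeSince(new Date(el.getAttribute('datetime')), new Date());
// });
```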


Notes

The above offers a GDPR compliant privacy enhanced Instagram embed. But this technique can be applied in the Hugo internal shortcodes and templates of other 3rd party services that have a JSON API like Disqus and Twitter.

The downside is that this is a time consuming approach and also that Hugo users will need to specify additional parameters, CSS and JS for the internal shortcodes and templates to function.

CC / @bep @it-gro

bep commented 6 years ago

@onedrawingperday I have added you as a collaborator to this repo, if you don't see an invite, let me know.

I have created a branch we can work on:

https://github.com/gohugoio/hugo/tree/GDPR

Once we're happy with the content of that, we merge it into master and make a release.

This is a working branch, so amend as you please, but I suggest that we -- to illustrate that we have put some sweat and hard work into this -- merge the commits as they are (not so much squashing...), so write some sensible commit messages ... with imperative mood ...

My first commit in that branch is how I see the configuration.

baseURL = "https://example.com"

[privacy]
[privacy.youtube]
noCookie = true

The above can be done per language, and the idea is to add new sections/flags as needed where the default when not set is off.

We should add a "GDPR table" in the docs somewhere (probably its own page) where we list all of these.

Note, that I have not used the above setting anywhere, but in a template:

{{ if .Site.PrivacyConfig.YouTube.NoCookie }}

{{ end }}

will work. And as the above setting will be available to end users, we should take a little care about naming etc., as it will be set in stone, more or less.

Sounds like a plan?

onedrawingperday commented 6 years ago

@bep I've received the invite, accepted it and I'll try to make the commits as terse and as explanatory as possible.

Sounds like a plan?

Yup! Pretty much. So that whenever .Site.PrivacyConfig.<service> is set the GDPR versions of the internal templates and shortcodes will be enabled.

That's very cool and backwards compatible.

I need some feedback though regarding my proposal for the Instagram shortcode. I have a stylesheet that needs to be included and also proposed the use of timeago.js.

My idea is to rebuild the HTML of the embeds (Instagram, Twitter) without the tracking. There are features missing from the public oEmbed API most notably the likes count.

I very much doubt that a user with an Access Token for the official APIs would have their app approved for bypassing the official embeds that have the tracking.

Is this a sensible approach? Do you agree with it? If yes, what would be the most efficient way to include the stylesheet and timeago.js?

Also as I said above the Instagram API is in a transition period and we might need to revisit this in the future.

bep commented 6 years ago

On an added note: The internal templates are all "hardcoded", i.e. inline, which makes them harder to edit than needed.

I can pull them out into files? (I even think there is an issue for that)

kaushalmodi commented 6 years ago

I can pull them out into files? (I even think there is an issue for that)

Yes 😃

bep commented 6 years ago

... and now figure out how to embed files in a Go binary ... So much I don't know.

bep commented 6 years ago

It would probably be smart to wait with the template updates until I finish this:

https://github.com/gohugoio/hugo/pull/4700

Should not take too long.

onedrawingperday commented 6 years ago

Yes I've noticed. Also #4700 will perhaps make it possible to include the assets that will be needed for the new templates.

bep commented 6 years ago

What kind of assets is that? Note that my PR is all about simpler management/administration/editing, it does not add any "asset bundling".

onedrawingperday commented 6 years ago

What kind of assets is that? Note that my PR is all about simpler management/administration/editing, it does not add any "asset bundling".

Ok.

@bep Whenever you have the time please review my work for the Instagram shortcode. See my comment above and the Instagram proposal here.

I really need to know what you think.

And by assets I meant the CSS and JS that need to go into the Instagram & Twitter shortcodes and if you want to implement a consent for the YouTube shortcode like this one that will also need some CSS and JS.

bep commented 6 years ago

I think our ambitions are a little misaligned and I will have to think hard about it. In general, I expect the services that we use to have privacy options in their APIs to turn stuff off. And I also expect them to have the consent forms etc. I spend enough hours on this project as is, and I'm not too keen on adding forms and user interface to the simple shortcodes. That will also lure us into a territory where we must take special care about the implementation, browser testing etc. All in all something that we are not manpowered to do.

onedrawingperday commented 6 years ago

I expect the services that we use to have privacy options in their APIs to turn stuff off

They don't. I've spent a long time on this. You can see with your own eyes.

Unless of course you have some trick up your sleeve to disable the user tracking with the default embeds I see no other way of going about it.

@it-gro has also done a great job with the Disqus shortcode. His approach is to disable everything -including the comments- until a user consents.

EDIT @bep Just to give you an idea of what the current situation is, here is a screenshot from the Hugo page with the YouTube video:

cookies

Note the Consent Value that is set to YES on page load. The YouTube API only generates this default embed. The youtube-nocookie domain is not available through the official API only through the oEmbed endpoint. Still the youtube-nocookie embed sets the CONSENT value to YES in local storage silently as soon as a user presses the play button. That's why DuckDuckGo have made that consent screen.

The Instagram and Twitter official APIs are more of the same. They generate embeds with tracking. And Vimeo, from what I've seen, does not offer a cookie-less embed like YouTube does.

It's their business model. If they planned to change anything for the GDPR they would have done something by now.

Anyway I understand your concerns. To be honest I was a bit shocked when you first opened the Discourse GDPR thread because I've been working on this since December and I know that turning off the 3rd party tracking is not a trivial thing to implement.

It's your call and I understand that you may not want to go down this road. But if you choose to do this I -for one- will be working hard on this.

bep commented 6 years ago

Unless of course you have some trick up your sleeve to disable the user tracking with the default embeds I see no other way of going about it.

I will look into this in detail during the weekend. But we are a static provider. I don't see how we possibly can have a consent screen on behalf of Google. That does not make sense on any level. Esp. since that consent is only stored in local storage (where is the tracking/documentation of that?)

So the trick up the sleeve for the services that do not provide an API for this is, as I currently see it, this:

{{ if not .Site.Privacy.YouTube.Disabled }}
Show youtube

{{ end }}

That would effectively block any youtube shortcode. Which would be GDPR compliant.

bep commented 6 years ago

@spf13 what is Google's take on this?

onedrawingperday commented 6 years ago

I don't see how we possibly can have a consent screen on behalf of Google. That does not make sense on any level.

This is the situation with Google, publishers & the GDPR consent currently. There is no guidance for smaller sites; we're too small. Anyway, the bottom line is that Google puts the onus on us, site admins that is. The EU has noticed this and they're not happy.

There is more coming from the EU with the ePrivacy Regulation (another link if you care). In its current draft, local storage is treated the same as cookies and needs consent. The only version of HTML5 storage that does not require consent is Session storage.

About disabling the 3rd party services that's something that people have already started doing. See here if you want.

At the end of the day disabling the services is the option with the least "overhead" for the Hugo project.

onedrawingperday commented 6 years ago

@jhabdas

Thanks for your input.

Rather than overreaching to try and protect people (leaving them essentially stupid) it would be better IMHO to give them the tools and knowledge necessary to make their own choices. In the context of GDPR that means stripping out anything related to Google, or YouTube, or whatever, and leaving a cookbook along with some caveats for those who want to take the plunge.

The cookbook seems unlikely. It wouldn't be proper to include a resource for GDPR compliant privacy practices in the Hugo Docs.

it-gro commented 6 years ago

I just committed my proposal for the disqus template in the GDPR branch: 88e2283. Here's a demo.

Visitors who did not yet agree with the hugo site to process private data (using disqus admin page) will see:

[screenshot]

i18n may be used to change or translate the text in the template: [screenshot]

- id: "disqusTxtAgree"
  translation: "Zeige Kommentare via [disqus.com](https://disqus.com). Aber erst müssen Sie den [AGBs](https://help.disqus.com/terms-and-policies/terms-of-service) zustimmen"
- id: "disqusBtnAgree"
  translation: "Ja, ich bin einverstanden"

After agreeing, the visitors will see no difference: [screenshot]

But in their local browser storage there will be: [screenshot]

As we are static - this is where the agreement is stored - on the visitors side.

This agree button is enabled by default. Hugo projects may disable it (and get the old behavior) via:

[privacy]
[privacy.disqus]
skipAgree = true

bep commented 6 years ago

There is a lot of stuff here, so let us take the most important part:

Is a consent form that shows some text with a yes/no that stores that flag into local storage good enough according to GDPR?

If yes, we should consider

But we cannot build something like this per template/shortcode. That would be way too much work.

But first someone must dig into the consent part and figure out if

bep commented 6 years ago

Not sure how official this is, but it is clear:

https://gdpr-info.eu/art-7-gdpr/

the controller shall be able to demonstrate that the data subject has consented to processing of his or her personal data.

And

The data subject shall have the right to withdraw his or her consent at any time.

Which doesn't work with what I read above.

onedrawingperday commented 6 years ago

@bep Yes, lawful consent under the GDPR has certain implications:

Make it easy for people to withdraw consent and tell them how. Keep evidence of consent – who, when, how, and what you told people.

Also see the UK Information Commissioner's Office page.

That is why, my friend, I have tried to make the embeds NOT require consent by:

  1. Putting the Google Analytics Client ID in session storage instead of a cookie.
  2. Rebuilding the HTML of the Instagram embed from scratch with no tracking.
  3. Changing the YouTube embed to nocookie so that Google does not set its goodies on a user's device before she clicks play. (A minimal notice will be needed, or the shortcode could play the video directly on YouTube's site, that is, turn it into a fancy link with the video thumb and a play button that opens in a new window.)
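As a rough sketch of the third point, the only change on the embed side is the host name: YouTube's privacy-enhanced mode serves the same `/embed/<id>` path from `www.youtube-nocookie.com`. The `embedURL` helper and its `noCookie` parameter are hypothetical names for this illustration and are not part of Hugo.

```go
package main

import (
	"fmt"
	"net/url"
)

// embedURL returns the iframe source for a YouTube video.
// With noCookie set, it points at the privacy-enhanced host instead.
// Hypothetical helper for illustration only.
func embedURL(videoID string, noCookie bool) string {
	host := "www.youtube.com"
	if noCookie {
		host = "www.youtube-nocookie.com"
	}
	u := url.URL{Scheme: "https", Host: host, Path: "/embed/" + videoID}
	return u.String()
}

func main() {
	fmt.Println(embedURL("abc123", false))
	fmt.Println(embedURL("abc123", true))
}
```

A shortcode could switch between the two hosts based on a site-level setting, which keeps the markup identical apart from the `src` attribute.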

If you decide to go down the CONSENT shortcode road we need to provide some kind of form so that a Hugo site admin can collect the user consent for their records.

Also we might need another WITHDRAW CONSENT shortcode that clears all Local Storage from a Hugo site. So that if a visitor wants to withdraw consent they can do it.

Both these shortcodes CONSENT and WITHDRAW require JS.

Obviously the GDPR is no fun.

But if you prefer to go with the simplest solution, which is to offer Hugo site admins the privacy option to disable all 3rd party services, then what you've already posted will be GDPR compliant:

{{ if not .Site.Privacy.YouTube.Disabled }}
Show youtube

{{ end }}

But then again people will either use this and turn off content they need from these 3rd party services or fend for themselves and try to work it out on their own.

The decision is yours.

it-gro commented 6 years ago

the controller shall be able to demonstrate that the data subject has consented to processing of his or her personal data.

If the skipAgree config is removed, I can demonstrate that there was a visitor consent via code review/audit. (No way to go ahead without "pressing the button").

The data subject shall have the right to withdraw his or her consent at any time.

Yes - this does not work - if it means that after withdrawing the personal data shall be actively removed. This is affecting the personal data where the hugo site admin is the controller. In the case of disqus the hugo site admin is the controller (since he is processing the personal data on the disqus systems).

The data subject shall have the right to withdraw his or her consent at any time.

But I read it this way: After withdrawing no new personal data shall be processed. So no need to remove the personal data history since for this history the consent was given. For Disqus:

But I see no way how to remove the personal data (IP, email) on the disqus site as an admin of a hugo site using disqus - except removing the comments of the withdrawing visitor. But as a hugo site admin - I will have no knowledge of his withdrawing...

onedrawingperday commented 6 years ago

@it-gro Time to migrate from Disqus.

bep commented 6 years ago

I think we need to take this in steps.

Could someone, in table form, list the services and their related config options? When listing those, think that it should be flags that should be "turned on" (default = false). And the default should be the most useful value (not the most private value).

If Disqus is the only one that requires a consent form to be somewhat useful, I'm tempted to wait with that (because it is a lot of work to get right), and add this setting to all services:

[disqus]
disabled = true

Disqus should really step up to the plate and ... fix this.

onedrawingperday commented 6 years ago

@bep Here is a tentative table.

[privacy]
[privacy.googleAnalytics]
disabled = false

[privacy.disqus]
disabled = true

[privacy.youtube]
disabled = false 
noCookie = true

[privacy.vimeo]
disabled = false

[privacy.twitter]
disabled = false
avatar = "" # URL string

[privacy.instagram]
disabled = false
avatar = "" # URL string

[privacy.speakerdeck]
disabled = false

Now what each of these settings does depends upon what you have in mind.

I'm not familiar with SpeakerDeck. They set 1 cookie that expires about 30 minutes after a user visit and another Session Cookie.

Twitter and Vimeo both have oEmbed APIs.

If you want to see live DEMOS of what I've done with Instagram and YouTube just ask and I'll put them online.

PS. The avatar setting under Instagram & Twitter is needed only if you go for the custom embeds I proposed earlier.

bep commented 6 years ago

@it-gro note that I will think really hard about what to do about Disqus and consent etc. Your code looks clean and good, but I think we need to think really hard about what we want to maintain as a project. Any UI code is "only if we really have to".

@onedrawingperday thanks for the overview. I will follow up on this tomorrow.

it-gro commented 6 years ago

but I think we need to think really hard about what we want to maintain as a project.

Yes, I agree.

One (extreme?) option could be to remove things like Disqus from the hugo core. A repo with templates, partials, shortcodes could be better. There the theme or project authors could "steal" from. They would be in charge of any improvements - and they could share them. A kind of templates / partials / shortcodes gallery.

But of course since currently there are such internal templates / shortcodes we just can't drop them. A freeze (=doing and breaking nothing) and later on a "deprecated" message on stdout could be done. So "pushing" the users to the new and fancy versions in a separate repo.

I found 64 themes (in https://github.com/gohugoio/hugoThemes) currently using _internal/disqus.html - but of course there are many more in the "wild".

In a user "templates / partials / shortcodes gallery" we probably would see some toolkit specific stuff (css classes, ...) (=> tagging, categorising, ... would be very important). There's a risk that we will end up in "tons of variations" of the same thing (e.g. disqus).

Such a repo is not a new idea: discourse shoud-we-add-internal-shortcodes-for-popular-services discourse creating-new-internal-short-codes

I'm perfectly fine if we don't use anything of my proposal. It may be a starting point - at least for some discussions...

onedrawingperday commented 6 years ago

@bep I forgot to say that in the above table I haven't included options for the Gist internal shortcode because I see no tracking from GitHub's part in pages with embedded gists.

GitHub is pretty cool!👍

onedrawingperday commented 6 years ago

I've looked a bit more into the Speaker Deck embed. (Didn't know that it's owned by GitHub).

The bad news is ~that I cannot find an official API for it.~ The embed contains the old Google Analytics tracking code ga.js. It also contains jQuery and all sorts of stuff from various CDNs.

I don't think there is much to do about it. It's either building a consent form for Speaker Deck's data collection ~or spend hours trying to rebuild the embed without the tracking.~

I don't use Speaker Deck so I will not look into it further.

EDIT They offer an oEmbed endpoint (link)

See here for a sample response.

But that is completely useless as it only calls the iframe that contains the presentation with all the user tracking.