OpenTreeOfLife / opentree

Opentree browsing and curation web site. For overarching or cross-repo concerns, please see the 'germinator' repo.
http://tree.opentreeoflife.org/
BSD 2-Clause "Simplified" License

GDPR compliance #1201

Open jimallman opened 5 years ago

jimallman commented 5 years ago

I'm working on a set of changes to catch up with the privacy protections mandated by the GDPR. This will include

On the last point, I'm not sure how we'd implement this given Github's data model. It sort of implies that we'd need to rewrite repo history and repeat their contributions as the API user.

jimallman commented 5 years ago

I suppose we also need to provide clear procedures for our users to revoke permission or request a purge of their personal data. At the very least, a contact email and standard procedures for satisfying these requests.

Note also that our planned use of Google Analytics may require additional action.

jar398 commented 5 years ago

Why does open tree care about GDPR? Do US sites get cut off in Europe for noncompliance?

jimallman commented 5 years ago

As I understand it, we provide unpaid services to EU residents and are therefore subject to GDPR. It seems the penalties are fines and lawsuit exposure, not being blocked per se.

The subject has come up in a few of our conference calls, and it was my understanding that the team is generally supportive of the GDPR's intent and of the importance of European contributors. If I'm mistaken in this, I can certainly focus my efforts elsewhere. Perhaps I should park this issue until our next call..?

snacktavish commented 5 years ago

It came up in the context of our collaboration with OneZoom, who are a UK-based non-profit and want to link out to our commenting system. So we should be compliant.

jar398 commented 5 years ago

That's a good reason. I was just curious. Thanks!

mtholder commented 5 years ago

see https://news.ycombinator.com/item?id=16509755

jar398 commented 5 years ago

Why do anonymous visitors need a click-through cookie warning? If they are anonymous, isn't it impossible to collect data of the sort that GDPR is concerned with? Not a GDPR scholar, just asking. Maybe the click through can be delayed until the point of login?

These GDPR things really bother me but I have not sufficiently educated myself about the actual requirement. I just get the impression that we see an awful lot more of these click-throughs than are strictly necessary under the statute, especially in situations where there is no data at stake. If all cookies trigger GDPR, which I doubt, then a site can simply not store cookies in the user's browser.

jimallman commented 5 years ago

Why do anonymous visitors need a click-through cookie warning? If they are anonymous, isn't it impossible to collect data of the sort that GDPR is concerned with?

Thanks, I'm chasing down answers to these now. I'm a bit concerned with anonymous visitors leaving feedback/comments, since they're prompted to add an email that is stored on Github. But maybe we can wait until that moment comes to raise privacy issues.
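
A minimal sketch of that "wait until the moment comes" idea, assuming hypothetical action names and a hypothetical `needs_privacy_notice` helper (neither is from the opentree webapps): the notice is raised only when a visitor is about to do something that stores personal data, and only if they have not already acknowledged it.

```python
# Hypothetical sketch; not taken from the opentree codebase.
# The action names below are illustrative assumptions.
PERSONAL_DATA_ACTIONS = {"login", "submit_feedback"}


def needs_privacy_notice(action, already_acknowledged):
    """Return True if a privacy notice should be shown before `action`."""
    if already_acknowledged:
        # The visitor has already seen and accepted the notice.
        return False
    # Plain browsing stores nothing personal, so it never triggers the notice.
    return action in PERSONAL_DATA_ACTIONS


if __name__ == "__main__":
    print(needs_privacy_notice("browse_tree", False))
    print(needs_privacy_notice("submit_feedback", False))
```

Whether deferring the notice like this satisfies the regulation is a separate question that the sketch does not answer.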

jimallman commented 5 years ago

Quick survey of personal information stored in the webapps:

In the web2py database user table, we store some information related to their Github account (required for login):

In web2py session files, we temporarily store:

All other data of consequence is stored in Github (and our API server's mirrored repos):

And a reminder from issues past:

[If a Github account's email] is verified and email is public it behaves correctly. As far as I can tell, if there is no public email associated with the account, even if there is a verified github email, commits will show up as anonymous.

So this suggests one option for users if they want to (for whatever reason) contribute anonymously -- to withhold their email address from public view on Github. But I suspect we'd still get their userid and display name during Github authentication, so they'd need to lie about those on Github as well. Sigh. 🙄
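
The behavior described in that older issue could be sketched roughly as follows. This is only an illustration: the profile field names (`public_email`, `display_name`, `login`) and the `commit_author` helper are assumptions, not the actual opentree or Github API.

```python
# Hypothetical sketch; field names and the placeholder identity are assumptions.
ANONYMOUS_AUTHOR = {"name": "Anonymous", "email": ""}


def commit_author(profile):
    """Pick the author identity to record for a contributor's commit."""
    email = profile.get("public_email")
    if not email:
        # No public email: the commit is attributed anonymously,
        # even if the account has a verified (but private) email.
        return dict(ANONYMOUS_AUTHOR)
    # Prefer the display name and fall back to the account's userid.
    name = profile.get("display_name") or profile.get("login") or "Anonymous"
    return {"name": name, "email": email}
```

Note that the sketch only covers what ends up in the commit; the userid and display name received during authentication would still need separate handling.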

jimallman commented 5 years ago

As we'd hoped, we can rely on Github to address some user privacy requests:

After an account has been deleted, certain data, such as contributions to other users' repositories and comments in others' issues, will remain. However, we will delete or deidentify your personal information, including your user name and email address, from the author field of issues, pull requests, and comments by associating them with the ghost user.

Github also has a dispute resolution process to handle privacy complaints.

If we think we need one, we can enter into a data-protection agreement with Github for GDPR compliance. I believe in order to do this, we'd (as an organization) also need to upgrade to their Corporate Terms of Service.

Also, as a reminder, we have a wiki page that explains how and why we use Github to save personal information.

jimallman commented 4 years ago

Consider a few refinements to the current version: