SeleniumHQ / selenium

A browser automation framework and ecosystem.
https://selenium.dev
Apache License 2.0

[🐛 Bug]: GDPR infringement, US Plausible telemetry without consent #14588

Closed aeris closed 1 month ago

aeris commented 1 month ago

What happened?

Since #13173, Selenium Manager tracks usage with Plausible, without consent. This is a GDPR violation.

Using telemetry without consent is a GDPR violation, and also a violation of multiple EDPB guidelines on this topic. I won't copy the whole thread here, but the rationale is laid out for the same kind of problem at https://github.com/thunderbird/thunderbird-android/issues/8199#issuecomment-2394447403

Worse, you use plausible.io, hosted on the IP 143.244.56.50, which belongs to DataCamp, a US company exposed to FISA requests, so there is also trouble with Schrems I & II. DataCamp is not even DPF approved (or at least I can't find them on https://www.dataprivacyframework.gov/list), and so it's a GDPR article 50 violation too.

How can we reproduce the issue?

Just use Selenium with an anti-tracker; you get a warning because of the Plausible usage.

Relevant log output

2024-10-11 17:28:51 WARN Selenium [:selenium_manager] Error sending stats to Plausible: error sending request for url (https://plausible.io/api/event)

Operating System

Not applicable

Selenium version

Ruby selenium-devtools 0.127.0

What are the browser(s) and version(s) where you see this issue?

Not applicable

What are the browser driver(s) and version(s) where you see this issue?

Not applicable

Are you using Selenium Grid?

No response

github-actions[bot] commented 1 month ago

@aeris, thank you for creating this issue. We will troubleshoot it as soon as we can.


diemol commented 1 month ago

I understand your concern. The tool assures us that it is GDPR compliant, and it was properly reviewed.

This is the same as the thread you referred to, so I will leave the same link here as a reply https://github.com/SeleniumHQ/selenium/pull/13173#issuecomment-2160783385.

Thank you.

aeris commented 1 month ago

It wasn't. Pretty much every piece of software says it is GDPR compliant. Way too many, if not all, are not even close to that in practice… Even apart from software compliance, you also need administrative processes to be compliant. Do you have a DPA signed with Plausible? Did you check Plausible's sub-contractors? Do you audit the Plausible/DataCamp infrastructure each year? Did you write a processing registry? Select the right legal basis? Etc. Taking "GDPR compliant" software, deploying it and using it, worse still in SaaS mode, is not GDPR compliant, and it cannot even become so without a HUGE administrative effort before that point.

joerg1985 commented 1 month ago

Slightly off topic, sorry in advance: I just had a look at https://www.selenium.dev/ and could not find a Privacy Policy including a hint to Analytics, like e.g. on the FSF website. Is such a thing necessary?

aeris commented 1 month ago

Yes

diemol commented 1 month ago

I went and checked Plausible's documentation and our subscription to make sure I was saying the right thing above. I also got in touch with them and their co-founder replied with helpful information:

We're a privacy-first web analytics startup that's built with GDPR in mind. We don’t use cookies, we don’t generate any persistent identifiers and we don’t collect or store any personal or identifiable data.

We’re happy to provide information on how Plausible is built to help you comply with the different privacy regulations. You can read more about how Plausible works in our data policy.

We exclusively use EU-owned cloud infrastructure so your site data never leaves the EU and EU owned infrastructure. And we do have a DPA that has been signed when creating an account: https://plausible.io/dpa

There is also a legal assessment of GDPR-compliant web analytics without consent written by a data protection lawyer that you can check out: https://plausible.io/blog/legal-assessment-gdpr-eprivacy

@aeris Having said that, if you have anything concrete, please reach out to our lawyers.

@joerg1985 might be a good idea to add that, would you like to help us and add that page?

diemol commented 1 month ago

Also, feel free to check the information Plausible has:

aeris commented 1 month ago

There is also a legal assessment of GDPR-compliant web analytics without consent written by a data protection lawyer that you can check out: https://plausible.io/blog/legal-assessment-gdpr-eprivacy

This legal assessment points out exactly the trouble with such an integration:

To benefit from the exemption from consent Provided that the conditions are met, we therefore switch from an opt-in to an opt-out regime.

So where is your opt-out option?

The GDPR does not permit a "GDPR compliant tool" certification, even a self-assessed one. GDPR compliance is always a three-part check: processing, legal basis, tool. Plausible is only the tool. The two other parts must be assessed too. Just saying "we use Plausible, it's GDPR compliant, we are good" is just pure bullshit. https://www.edpb.europa.eu/system/files/2024-10/edpb_guidelines_202401_legitimateinterest_en.pdf

For the processing, you need to explain why you need such stats, what you collect, how you anonymize the data, why you need such processing, and why you can't use another, less intrusive way. Currently the user gets no information about the processing when using Selenium. The processing is therefore unlawful.

Even if it were lawful, you would then need a legal basis. In this case, legitimate interest. Then you need to pass the triple test: legitimate, necessary, proportionate. Even the first is not easy, because you give no information before processing, as stated above. Necessary seems difficult too: we have no idea what you do with the data or whether it is critical enough to be really necessary, and even a statement from your team shows it is not necessary (I can't find the GitHub post again, but somebody said something like "if it's too difficult, we can remove this feature"; I will edit if I find it again). So legitimate interest under 6(1)f is NOT possible, and only consent remains.

Even if 6(1)f were possible, it would mean opt-out as said in the legal assessment above, right to object (article 21), right of access (article 15), a legal DPA signed with Plausible with a yearly audit of their infrastructure as stated in article 28 for contractors/sub-contractors, legal validation such as for international data transfers under article 44 (yes, Plausible is under UK law, which is no longer covered by the EEA/GDPR since Brexit), and many other things. I bet you cover none of it.

So no, your processing is currently clearly unlawful, even if you use a "GDPR compliant" tool like Plausible.

diemol commented 1 month ago

You probably missed this link as well: https://www.selenium.dev/documentation/selenium_manager/#data-collection

We have all our information public. As the link says, we are using this to understand Selenium usage.
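For readers who simply want to stop the request that produces the warning in the log above: to my knowledge, the documentation page linked here describes an `SE_AVOID_STATS` environment variable that disables Selenium Manager's reporting. The following is only a minimal sketch under that assumption; the helper name `env_without_stats` is hypothetical and is not part of Selenium.

```python
import os


def env_without_stats(base_env=None):
    """Return a copy of an environment mapping with stats reporting disabled.

    Assumption: Selenium Manager skips the Plausible request when the
    SE_AVOID_STATS environment variable is set to "true".
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SE_AVOID_STATS"] = "true"
    return env


if __name__ == "__main__":
    # Apply to the current process before any Selenium session is created,
    # so that child processes started afterwards inherit the setting.
    os.environ.update(env_without_stats())
    print(os.environ["SE_AVOID_STATS"])
```

This only shows how an individual user might opt out on their own machine; it says nothing about whether an opt-out of this kind satisfies the legal requirements debated in this thread.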

@aeris If you have anything concrete, please get in touch with our lawyers at https://sfconservancy.org/.

Please also keep in mind your tone. We are respectful and listen to your comments, whereas you come to us with a harsh tone.

aeris commented 1 month ago

I found the link I spoke about above: https://github.com/SeleniumHQ/selenium/pull/13173#issuecomment-2160783385

So the options were either not to collect any information or to do so the way we have

Because you admit yourselves that you CAN avoid collecting this data at all, you MUST NOT collect it at all. Article 5 allows only strictly necessary processing, and article 6(1)f specifically requires the triple test for legitimate interest (legitimate, necessary, proportionate).

aeris commented 1 month ago

You probably missed this link as well: https://www.selenium.dev/documentation/selenium_manager/#data-collection

I didn't miss it. It just isn't a legally valid notice, since it is not shown at the latest at the first processing. Article 21(4) GDPR: https://www.privacy-regulation.eu/en/21.htm

  4. At the latest at the time of the first communication with the data subject, the right referred to in paragraphs 1 and 2 shall be explicitly brought to the attention of the data subject and shall be presented clearly and separately from any other information.

Please also keep in mind your tone. We are respectful and listen to your comments, whereas you come to us with a harsh tone.

No, you are definitely not. We are fed up with "GDPR compliant processors taking your data privacy very seriously" who don't even know a single line of the GDPR in practice and refuse to admit unlawful processing, as stated by multiple EDPB guidelines or other explicit statements in the law itself.

You point me to a "legal assessment" that you didn't even read correctly, or at least whose explicit statement you don't follow: legitimate interest MUST provide 1) explicit information and 2) a way to opt out. You provide NONE. You come to me with statements like "we use this data to understand usage", which is EXPLICITLY REJECTED as a legal basis because it can't be explicit enough to cover the article 5 requirement, both by the EDPB (https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2013/wp203_en.pdf page 16) and by the CJEU (https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:62021CJ0252#point123).

It's the same endless thing each time, for pretty much any piece of software we need to use as end users, with a company saying "if it's not OK, just talk to my lawyer", and lawyers not even able to correctly read a text in place since 2016 for the GDPR and since 1995 for Directive 95, which is the ancestor of the GDPR and a word-by-word copy.

For pretty much every single piece of software, we literally have to spend countless hours explaining why it breaks the law and violates our fundamental rights. And instead of receiving an "oops, sorry, we really care about your privacy, you are right, this processing is unlawful and we will remove it, and we are firing our obviously incompetent lawyers", we have to fight against decade-long unlawful processing. Each. Fucking. Time.

aeris commented 1 month ago

To show the problem: we already had this exact same month-long discussion, with the same "we respect privacy, speak to my lawyer" shit, on

Each time it takes hours, days, weeks, if not months or years, to get these unlawful features removed from "privacy aware" FLOSS…

github-actions[bot] commented 1 day ago

This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.