[Security and GDPR Issue] ProtonMail includes Google Recaptcha for Login, every single time.

cookiengineer commented 3 years ago

Description:

A change made over the course of the last two weeks affects returning users who log in again. Recaptcha is now injected and compromises a machine's identity on every single login; especially so if cookies are deleted afterwards to preserve user privacy.

Steps to reproduce the behavior:

Use any adblocker of choice (e.g. uBlock Origin with Cookie Autodelete)
Go to https://mail.protonmail.com/login
Find out ProtonMail is using Google Recaptcha, compromising privacy of all its already registered users.

Expected behavior:

As a project/company that was founded as an immediate response to the Snowden Leaks, which revealed that the Google PREFs cookie is literally how the NSA tracks users across the planet, I find this very absurd to see.

I understand that the intention is to lower the rate of spammer accounts in the Registration process. But recurring users that have -TWO- passwords to identify themselves with should not need to re-identify themselves as a human. And especially not with an unethical service such as Google, which seems not to respect any privacy laws that are applicable in the European Union.

To be honest, this issue is a reason for me to change services. As a user who sponsored the crowdfunding campaign, I feel betrayed, and I think that this is a serious breach of GDPR law. I'm a European citizen (from Germany) and I never agreed to share any information with Google.

I also understand that Recaptcha-like services may be necessary when ProtonMail faces lots of Tor traffic (which would actually also endanger journalists abroad, btw). But this web traffic was received by ProtonMail without any proxy in between, from my ISP's geo-ip-confirmable IP.

Currently, if ProtonMail continues to deanonymize its users by including Google's Recaptcha code, I cannot recommend ProtonMail as a service to anyone anymore.

OS is ArchLinux
Browser is Ungoogled Chromium (latest) with uBlock Origin and Cookie Autodelete.
URL is mail.protonmail.com

edit: I wanted to clarify the narrative that ProtonMail tries to make. This Captcha appeared AFTER I entered the correct password for my login, and AFTER I entered the correct password for mailbox decryption. After clicking through three almost unsolvable captchas, I was led straight to the Inbox view.

This was no anti-bruteforce measure. This was no anti-credential-stuffing measure. This was a false positive in classification by IPv4 address (I have an ISP that shares its IPv4 addresses, as all customer hardware primarily uses IPv6) (read below for what I think can be done to help mitigate this problem).

AdmiralNemo commented 3 years ago

I personally totally understand ProtonMail's arguments from a technical perspective. I can appreciate their position, and I can of course offer no advice on how else they could have handled this problem. I definitely see the value in using existing, proven technology to solve a difficult problem.

What concerns me is the message this sends about the overall state of privacy on the Internet. ProtonMail is a company founded on the principle of providing private communication. They regularly distribute propaganda about how "bad" Google is for privacy. If they cannot solve this problem without resorting to using one of the most invasive Google products, then who can? I came to ProtonMail deliberately to get away from Google, and I pay them to support that. I definitely don't like the idea that it may no longer be possible to even pay to avoid Google. This incident has prompted me to search again for another provider that will hopefully consider this as scary as I do.

bartbutler commented 3 years ago

I personally totally understand ProtonMail's arguments from a technical perspective. I can appreciate their position, and I can of course offer no advice on how else they could have handled this problem. I definitely see the value in using existing, proven technology to solve a difficult problem.

What concerns me is the message this sends about the overall state of privacy on the Internet. ProtonMail is a company founded on the principle of providing private communication. They regularly distribute propaganda about how "bad" Google is for privacy. If they cannot solve this problem without resorting to using one of the most invasive Google products, then who can? I came to ProtonMail deliberately to get away from Google, and I pay them to support that. I definitely don't like the idea that it may no longer be possible to even pay to avoid Google. This incident has prompted me to search again for another provider that will hopefully consider this as scary as I do.

We understand your concerns. As stated in earlier comments, this is only hitting a small fraction of legitimate users and will be replaced in the coming weeks with a non-Google CAPTCHA solution, and in the longer term hopefully with more clever non-CAPTCHA techniques to minimize the need for CAPTCHA.

AdmiralNemo commented 3 years ago

We understand your concerns. As stated in earlier comments, this is only hitting a small fraction of legitimate users and will be replaced in the coming weeks with a non-Google CAPTCHA solution, and in the longer term hopefully with more clever non-CAPTCHA techniques to minimize the need for CAPTCHA.

You say that, but unfortunately I, and clearly several others on this thread, have lost a lot of trust in you because of this incident. Consider, for example, how long we have been hearing "we are working on a non-Google push notification system for Android" as well.

bartbutler commented 3 years ago

Thank you @hakusaro for the constructive reply. I think several of your questions have been answered in various other comments but I'll do it here as well for clarity.

Modern CAPTCHAs are extremely difficult to build and require extraordinary amounts of resources to stay ahead of automated CAPTCHA solvers, including those published by Google itself.

I think you recognize part of the problem, and that's good, but I think you've miscalculated the cost/benefit analysis on this front.

Based on everything I've read, it seems like you have some attacker who has the following attributes:

Has sufficient resources to either operate a large scale residential botnet, or has enough money to borrow the services of a large scale residential botnet.

Likely has a list of compromised passwords from another service.

Has some reason to get into these accounts that is presumably something more valuable than the lulz, because of number one.

Correct. We believe they rent the network. There are several commercial outfits that sell this kind of "service" today unfortunately.

If you've deployed recaptcha and that's solved your problem, allow me to congratulate you: you have an attacker who is motivated, but not that motivated. Human-based captcha farms exist, and are a cost effective and time efficient way of solving captcha related problems for attackers who are motivated. If your users are bitcoin billionaires, you will see the attack morph when either they become aware that it's no longer working, or they exhaust other more-easily-attacked targets.

Yes, for sure. The attacks we've seen are broad-based rather than targeted. My guess is that the economics of CAPTCHA solves don't work for them when they have to make many millions of attempts per day to compromise enough accounts to be viable for their purpose. At DeathByCaptcha prices, $3/1000 solves for millions of attempts per day is real money.
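To put rough numbers on the economics described above, here is a minimal sketch. The $3 per 1000 solves price is taken from the comment; the daily attempt volumes are hypothetical, since "many millions" is not quantified in the thread.

```python
# Rough cost of paying a solving service for every login attempt.
# The $3 per 1000 solves price comes from the comment above; the daily
# attempt volumes are hypothetical, since "many millions" is not quantified.
PRICE_PER_1000_SOLVES_USD = 3.0


def daily_solve_cost(attempts_per_day: int) -> float:
    """Return the USD cost of solving one CAPTCHA per login attempt."""
    return attempts_per_day / 1000 * PRICE_PER_1000_SOLVES_USD


if __name__ == "__main__":
    for attempts in (1_000_000, 5_000_000, 10_000_000):
        cost = daily_solve_cost(attempts)
        print(f"{attempts:>10,} attempts/day -> ${cost:,.0f}/day")
```

At these assumed volumes the bill lands between $3,000 and $30,000 per day, paid whether or not any given attempt compromises an account.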

If your issue is that paid accounts are being targeted, you actually do have two-factor authentication: payment methods. If you have renewal revenue and cards on file, you can easily prompt users who don't have 2FA but do have a card on file for card-related information.

This is possible but also leaks this information to attackers, which is a privacy consideration as well.

If your issue is that free accounts are being targeted, your solutions are a bit limited. You do, of course, have the option of deploying some captcha. I think this thread demonstrates that at least some percentage of users are not okay with this solution. Are these paid users? Free users? Can you disable the captcha for paid users somehow? Can you implement a two-step login flow, where users enter their email and either are given a captcha, or prompted for 2FA, or are more strictly rate limited based on the plan type?

We could, but again this would leak this kind of information to anyone who knows how to look. There's no specific category of accounts being targeted.

On your frustration with WebAuthn: I totally understand where you're coming from here. WebAuthn has a lot of downsides, and the biggest one is that it makes authentication over two different domains difficult. I suggest exploring either off-the-shelf central authentication services, or build some kind of single "authentication domain" that can implement OpenID Connect or SAML, in the long term.

That is precisely what we have done. Stay tuned.

Obviously, there are no perfect solutions here. I just think that if your product is happy to pitch security as a feature, you should consider more liberal application of creativity. Just a few examples, which may or may not be helpful:

If you haven't already implemented a leaky bucket/gcra solution for logins, implement it!

We have; such a system has been in place for years.

If you implement a leaky bucket/gcra, provide absolutely no feedback to the user when they're being rate limited. Instead, return a generic "invalid username/password" type message, so the attacker doesn't know that their IP is now banned/rate limited.

There are some places where we do this obfuscation. Login is not one of them. In our experience, such a message would be terrifying for legitimate users and cause a deluge of customer service complaints, and indeed, has in the past when we've tried related things. This is currently where the CAPTCHA comes in.
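For readers unfamiliar with the leaky bucket/GCRA suggestion discussed above, a minimal in-memory sketch follows. It is not ProtonMail's system: the class name, parameters, and the choice of key (an IP address or an account name) are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict


class GcraLimiter:
    """Per-key GCRA limiter: `burst` requests at once, then one per `interval` seconds."""

    def __init__(self, interval: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.interval = interval                  # emission interval T
        self.tolerance = interval * (burst - 1)   # burst tolerance tau
        self.clock = clock                        # injectable for testing
        self._tat: Dict[str, float] = {}          # theoretical arrival time per key

    def allow(self, key: str) -> bool:
        """Return True if a request for `key` conforms to the configured rate."""
        now = self.clock()
        tat = max(self._tat.get(key, now), now)
        if tat - self.tolerance > now:
            return False                          # too early: reject, state unchanged
        self._tat[key] = tat + self.interval
        return True
```

With `GcraLimiter(interval=10.0, burst=3)`, a key gets three immediate attempts and then one more every ten seconds. Keying by IP address does little against an attacker who never reuses an IP, which is the situation described later in this comment.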

If users are being compromised on the first attempt (e.g., because of reused credentials), proactively notify users who are at risk, initiate password reset, or require manual account recovery. If it's a "one and done" approach it would actually be better if you had the attacker using the same set of IPs, because you could just sinkhole those IPs and not tell them.

This whole situation has arisen because they apparently have access to unlimited IPs and do not reuse them. We are also building more tools to notify users of compromised credentials, but that's tricky to do safely as well, and as a private email service we do not have many ways to force verification of identity, so we can't disable accounts pre-emptively and require a reset via, say, phone. Anyway, this is actively in progress.

For all you know, it could be an attacker's goal to show that you're deficient on privacy by using recaptcha. Your competitors are getting free marketing collateral by being able to say "we'll do something more clever than implement recaptcha." I'm not saying you need to go all conspiracy theory on this stuff, but as someone who recently had to mitigate a few attacks and didn't use recaptcha, I know that there are good alternatives and strategies you can take.

In the short term, you might be stuck, but long term I'd suggest building out a plan for "quick and easy non-captcha-related solutions" that can be deployed. Assume that either the current or future attacker will be sufficiently motivated that a captcha will be defeated by a captcha farm, and work from there.

Yes, if the economics make sense for them, an attacker will get through CAPTCHAs as well. As discussed in an earlier comment, we are also working on several non-CAPTCHA strategies, but these take some time to develop and support on all clients.

waltercool commented 3 years ago

For people doing weird and nonsense attacks, this is what it looks like.

https://imgur.com/a/ue9c7UZ

Of course there is traffic to Google the first time, from requesting and submitting the captcha, but from my review there is no tracking cookie. Technically Google may know you did a captcha, but it has no way to identify you.

vladimiry commented 3 years ago

This is a long discussion, so maybe I have missed this point already being discussed further up in the thread. Has the team considered the option of allowing access to the login entry point only from certain user-defined IP addresses or countries/regions? Of course, there would then be a need for a recovery option for urgent access from a random location, maybe a list of one-time codes + customer support. I guess at least the corporate users would appreciate such a security enhancement.
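The IP part of this proposal could look roughly like the sketch below. The function name and the idea of storing the allowlist as CIDR strings are assumptions for illustration; country/region matching would additionally need a GeoIP database, and the one-time-code recovery path is not shown.

```python
import ipaddress
from typing import Iterable


def login_allowed(client_ip: str, allowed_networks: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any user-defined network (CIDR)."""
    address = ipaddress.ip_address(client_ip)
    for cidr in allowed_networks:
        network = ipaddress.ip_network(cidr, strict=False)
        # An IPv4 address never matches an IPv6 network, and vice versa.
        if address.version == network.version and address in network:
            return True
    return False


if __name__ == "__main__":
    # Documentation-only address ranges, used here as placeholder settings.
    allowlist = ["203.0.113.0/24", "2001:db8::/32"]
    print(login_allowed("203.0.113.7", allowlist))
    print(login_allowed("198.51.100.1", allowlist))
```

A user on a dynamic or shared IPv4 address, as described in the opening post, would have to allow a whole provider range, which weakens the protection considerably.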

cookiengineer commented 3 years ago

For people doing weird and nonsense attacks, this is what it looks like.

https://imgur.com/a/ue9c7UZ

Of course there is traffic to Google the first time, from requesting and submitting the captcha, but from my review there is no tracking cookie. Technically Google may know you did a captcha, but it has no way to identify you.

Go to Network > Select the URLs for recaptcha, and take a look at the HTTP headers for "ETag". Then write the ETag headers down somewhere, just for the sake of fun.

Go to Incognito Mode, open ProtonMail, and trigger Recaptcha again. Do the same as instructed before. Voila - same ETag headers. You've successfully been tracked and identified without a Cookie because that's how ETags work - they're connection specific and can only be deleted when the Browser Cache is deleted completely - while all long-life sockets have been flushed and ended (see chrome://net-internals > Sockets).

Reproduce this on another machine to verify that for each Browser, and each Machine, from the same IP, you'll get a different ETag response, and that the Web Browser is happy to send the If-None-Match request header once it reconnects to the same URL; given that it's not a Weak ETag with a W/ prefix :)

If you don't believe me, you can read further on what they do in the ETag article on MDN.
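As a sketch of the general technique being described (not a claim about what Google's servers do, which this thread does not establish), a server can mint a unique ETag per cache and recognise it when the browser revalidates. The function name and the use of a random hex token as the identifier are hypothetical.

```python
import secrets
from typing import Dict, Tuple


def respond(request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], str]:
    """Return (status, response headers, visitor id) for one cacheable resource."""
    presented = request_headers.get("If-None-Match")
    if presented:
        # The browser revalidates its cached copy and echoes the ETag back,
        # which lets the server recognise the same cache again.
        visitor_id = presented.strip('"')
        return 304, {"ETag": presented}, visitor_id
    # First visit (or cache cleared): mint a fresh, unique validator.
    visitor_id = secrets.token_hex(8)
    headers = {
        "ETag": f'"{visitor_id}"',
        # "no-cache" lets the browser store the response but forces
        # revalidation on every use, so If-None-Match is sent each time.
        "Cache-Control": "private, no-cache",
    }
    return 200, headers, visitor_id
```

The identifier survives only as long as the cached response does, so clearing the browser cache yields a new one on the next request.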

waltercool commented 3 years ago

@cookiengineer I took a look, but there is no E-Tag for me. Pasting all the headers from browser.

Note: I don't have google.com cookies blocked or any special ban for that website.

General

Request URL: https://www.google.com/recaptcha/api.js?onload=loadCaptcha&render=explicit
Request Method: GET
Status Code: 200 
Remote Address: 142.250.64.164:443
Referrer Policy: strict-origin-when-cross-origin

Response Headers

alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
cache-control: private, max-age=300
content-encoding: gzip
content-length: 575
content-security-policy: frame-ancestors 'self'
content-type: text/javascript; charset=UTF-8
cross-origin-resource-policy: cross-origin
date: Mon, 31 May 2021 23:02:33 GMT
expires: Mon, 31 May 2021 23:02:33 GMT
server: GSE
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block

Request Headers
:authority: www.google.com
:method: GET
:path: /recaptcha/api.js?onload=loadCaptcha&render=explicit
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
referer: https://mail-api.protonmail.com/
sec-fetch-dest: script
sec-fetch-mode: no-cors
sec-fetch-site: cross-site
sec-gpc: 1
user-agent: <Removed for obvious reasons>
onload: loadCaptcha
render: explicit

General

Request URL: https://www.google.com/recaptcha/api2/anchor?ar=1&k=6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn&co=aHR0cHM6Ly9tYWlsLWFwaS5wcm90b25tYWlsLmNvbTo0NDM.&hl=en&v=sG0iO6gHcGdWJzjJjW9AY49S&theme=light&size=normal&cb=3ebtvvo1nuxo
Request Method: GET
Status Code: 200 
Remote Address: 142.250.64.164:443
Referrer Policy: strict-origin-when-cross-origin

Response Headers

alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
cache-control: no-cache, no-store, max-age=0, must-revalidate
content-encoding: gzip
content-length: 20235
content-security-policy: script-src 'report-sample' 'nonce-YrUVSKBvfDeLdomXaTW8kQ' 'unsafe-inline' 'strict-dynamic' https: http: 'unsafe-eval';object-src 'none';base-uri 'self';report-uri https://csp.withgoogle.com/csp/recaptcha/1
content-type: text/html; charset=utf-8
date: Mon, 31 May 2021 23:02:34 GMT
expires: Mon, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
server: GSE
x-content-type-options: nosniff
x-xss-protection: 1; mode=block

Request Headers
:authority: www.google.com
:method: GET
:path: /recaptcha/api2/anchor?ar=1&k=6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn&co=aHR0cHM6Ly9tYWlsLWFwaS5wcm90b25tYWlsLmNvbTo0NDM.&hl=en&v=sG0iO6gHcGdWJzjJjW9AY49S&theme=light&size=normal&cb=3ebtvvo1nuxo
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
referer: https://mail-api.protonmail.com/
sec-fetch-dest: iframe
sec-fetch-mode: navigate
sec-fetch-site: cross-site
sec-gpc: 1
upgrade-insecure-requests: 1
user-agent: <Removed for obvious reasons>

Query String Parameters
ar: 1
k: 6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn
co: aHR0cHM6Ly9tYWlsLWFwaS5wcm90b25tYWlsLmNvbTo0NDM.
hl: en
v: sG0iO6gHcGdWJzjJjW9AY49S
theme: light
size: normal
cb: 3ebtvvo1nuxo

Took a look at other requests going to Google, like assets and js, but no ETag found on my side.
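One detail in the dump above can be checked directly: the `co` query parameter looks like URL-safe base64. A minimal sketch for decoding it, assuming the trailing `.` stands in for `=` padding (the helper name is hypothetical):

```python
import base64


def decode_co(value: str) -> str:
    """Decode the `co` query parameter, assuming '.' replaces '=' padding."""
    padded = value.replace(".", "=")
    return base64.urlsafe_b64decode(padded).decode("utf-8")


if __name__ == "__main__":
    print(decode_co("aHR0cHM6Ly9tYWlsLWFwaS5wcm90b25tYWlsLmNvbTo0NDM."))
    # prints "https://mail-api.protonmail.com:443"
```

So the anchor request tells Google which origin embedded the widget, consistent with the `referer` header in the same request.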

waltercool commented 3 years ago

Also, this seems to be the original request, and likely the one exchanging the background information with Google

General
Request URL: https://mail-api.protonmail.com/core/v4/captcha?Token=OAehsglTtuh9i9nqFS8a7LIr
Request Method: GET
Status Code: 200 
Remote Address: 185.70.41.85:443
Referrer Policy: strict-origin-when-cross-origin

Response Headers
access: application/vnd.protonmail.api+json;apiversion=4
cache-control: max-age=0, must-revalidate, no-cache, no-store, private
content-encoding: gzip
content-length: 31182
content-security-policy: default-src 'self'; script-src 'self' 'unsafe-eval' 'nonce-YLVrCcI3CkWwkDgz2IrkrAAAANI'; style-src 'self' 'nonce-YLVrCcI3CkWwkDgz2IrkrAAAANI'; frame-src https://www.google.com/recaptcha/; report-uri https://reports.protonmail.ch/reports/csp;
content-type: text/html; charset=UTF-8
date: Mon, 31 May 2021 23:02:33 GMT
expect-ct: max-age=2592000, enforce, report-uri="https://reports.protonmail.ch/reports/tls"
expires: Fri, 04 May 1984 22:15:00 GMT
public-key-pins-report-only: pin-sha256="<Unnecessary to expose>"; pin-sha256="<Unnecessary to expose>"; report-uri="https://reports.protonmail.ch/reports/tls"
referrer-policy: strict-origin-when-cross-origin
set-cookie: Session-Id=YLVqyVhdBKFG-EHnpJU@egAAABM; Domain=protonmail.com; Path=/; HttpOnly; Secure; Max-Age=7776000
set-cookie: Version=default; Path=/; Secure; Max-Age=7776000
strict-transport-security: max-age=31536000; includeSubDomains; preload
vary: Accept-Encoding
x-content-type-options: nosniff
x-permitted-cross-domain-policies: none
x-xss-protection: 1; mode=block; report=https://reports.protonmail.ch/reports/csp

Request Headers
:authority: mail-api.protonmail.com
:method: GET
:path: /core/v4/captcha?Token=OAehsglTtuh9i9nqFS8a7LIr
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
cookie: Session-Id=<Unnecessary to expose>
referer: https://mail.protonmail.com/
sec-fetch-dest: iframe
sec-fetch-mode: navigate
sec-fetch-site: same-site
sec-gpc: 1
upgrade-insecure-requests: 1
user-agent: <Removed for obvious reasons>

Query String Parameters
Token: OAehsglTtuh9i9nqFS8a7LIr

And completing the form:

General
Request URL: https://www.google.com/recaptcha/api2/userverify?k=6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn
Request Method: POST
Status Code: 200 
Remote Address: 142.250.64.164:443
Referrer Policy: strict-origin-when-cross-origin

Response Headers
alt-svc: h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
cache-control: private, max-age=0
content-encoding: gzip
content-length: 535
content-security-policy: frame-ancestors 'self'
content-type: application/json; charset=utf-8
date: Mon, 31 May 2021 23:34:55 GMT
expires: Mon, 31 May 2021 23:34:55 GMT
server: GSE
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block

Request Headers
:authority: www.google.com
:method: POST
:path: /recaptcha/api2/userverify?k=6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
content-length: 4820
content-type: application/x-www-form-urlencoded;charset=UTF-8
origin: https://www.google.com
referer: https://www.google.com/recaptcha/api2/bframe?hl=en&v=sG0iO6gHcGdWJzjJjW9AY49S&k=6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn&cb=sw0v7k2zoyz
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
sec-gpc: 1
user-agent: <Removed for obvious reasons>

Query String Parameters
k: 6LcWsBUTAAAAAOkRfBk-EXkGzOfcSz3CzvYbxfTn

Form Data
v: sG0iO6gHcGdWJzjJjW9AY49S
c: <Long unnecessary string>
response: eyJyZXNwb25zZSI6WzEsNSw3XSwiZSI6ImJXOFI0anNSMmtQVHRLbHROSVRaVkxDOGFJZyJ9
t: 20425
ct: 20425
bg: <Long unnecessary string>
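The `response` form field above is also base64, this time wrapping JSON. A small sketch for inspecting it follows; the helper name is hypothetical, and what the fields mean is not documented in this thread (reading the list as selected tile indices is only a guess).

```python
import base64
import json


def decode_response_field(value: str) -> dict:
    """Decode the base64-encoded JSON carried in the `response` form field."""
    padded = value + "=" * (-len(value) % 4)   # restore any stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


if __name__ == "__main__":
    field = "eyJyZXNwb25zZSI6WzEsNSw3XSwiZSI6ImJXOFI0anNSMmtQVHRLbHROSVRaVkxDOGFJZyJ9"
    print(decode_response_field(field))
    # prints {'response': [1, 5, 7], 'e': 'bW8R4jsR2kPTtKltNITZVLC8aIg'}
```

The `c` and `bg` fields were redacted in the dump, so nothing can be said here about what they carry.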

ph00lt0 commented 3 years ago

This is insane. Not implementing U2F but 'instead' implementing Google's spyware, well done! The arguments about not implementing 2FA are insane to me. Many companies using multiple domains have figured out how to do this. If this is all too complicated for Protonmail I really start to worry about the general quality of the product.

bartbutler commented 3 years ago

This is insane. Not implementing U2F but 'instead' implementing Google's spyware, well done! The arguments about not implementing 2FA are insane to me. Many companies using multiple domains have figured out how to do this. If this is all too complicated for Protonmail I really start to worry about the general quality of the product.

Other companies do not normally have to manage and share cross-domain secret client-side information that can't be shared with the server. It complicates SSO a bit.

ph00lt0 commented 3 years ago

I disagree here. I have seen and worked on many SSO solutions with U2F at various companies. It would be better to invest your time in a central login system instead of adding this google crap.

Other companies do not normally have to manage and share cross-domain secret client-side information that can't be shared with the server. It complicates SSO a bit.

bartbutler commented 3 years ago

I disagree here. I have seen and worked on many SSO solutions with U2F at various companies. It would be better to invest your time in a central login system instead of adding this google crap.

Other companies do not normally have to manage and share cross-domain secret client-side information that can't be shared with the server. It complicates SSO a bit.

Just to be clear, these aren't actually very closely related discussions--U2F isn't a solution to the problem at hand.

almasen commented 3 years ago

Very happy to see the privacy policy update:

Data related to the opening of an account

[...] In order to pursue our legitimate interest of preventing the creation of accounts by spam bots or human spammers, we use a variety of human verification methods. Verification may also be requested for some sensitive operations besides account creation in order to protect against brute-force attacks. You may be asked to verify using either hCaptcha (or reCAPTCHA in the event that hCaptcha is unavailable), Email, or SMS.

As of today, the Privacy Policy was last modified on June 8, 2021

Definitely a step in the right direction.

Qix- commented 3 years ago

or reCAPTCHA in the event that hCaptcha is unavailable

Can a box be shown that draws attention to the cases where this happens, e.g. NOTE: hCaptcha is unavailable, and as such we have fallen back to Google's reCAPTCHA. If you do not want to initiate a connection to Google, you may choose to reload the page to try again, or attempt to login at a later time.

It would be even better if Google's scripts weren't loaded until clicking on some sort of acknowledgement link first, too. That gives the user 100% control over whether or not they partake in the Google nonsense without compromising any of the security goals of ProtonMail.

tomkel commented 3 years ago

why was this closed?

bartbutler commented 3 years ago

It was closed as part of the migration to the new React codebase and mono-repository for Proton webapps. That said, we switched to 100% hcaptcha in mid-June, as well as made improvements in how often CAPTCHA triggers for non-abusive users, so it is appropriate to close this particular issue anyway.

Qix- commented 3 years ago

Thank you for the update and actually following through on promises. Increasingly rare occurrence for companies to do so these days, especially when security and privacy are involved.

markcellus commented 3 years ago

It was closed as part of the migration to the new React codebase and mono-repository for Proton webapps.

@bartbutler where is this mono repo located? Also, you guys do know that React is owned and maintained by facebook, right?

bartbutler commented 3 years ago

This is it (this repository): https://github.com/ProtonMail/WebClients

Re: React, yes we do, but we don't hold that against the framework itself. Facebook employs a lot of excellent engineers.

mourednik commented 3 years ago

It was closed as part of the migration to the new React codebase and mono-repository for Proton webapps.

@bartbutler where is this mono repo located? Also, you guys do know that React is owned and maintained by facebook, right?

Facebook also contributes to the Linux kernel. @bartbutler Your system is using Linux (written by Facebook engineers) therefore it is insecure garbage. I'm cancelling my ProtonMail account!

markcellus commented 3 years ago

Facebook isn't garbage and their engineers aren't the problem. But if you all are going to base your entire frontend on Facebook's technology, I hope you all have a plan for when it falls behind and is no longer maintained. React has major interoperability issues with standard web specifications, along with performance issues. If you want to lock yourself into that technology, only to have to rewrite all of your code in another year or so when it becomes extinct, fine by me. 😄👌

ProtonMail / WebClients

[Security and GDPR Issue] ProtonMail includes Google Recaptcha for Login, every single time. #242
