w3c / captcha-accessibility

Inaccessibility of CAPTCHA
https://w3c.github.io/captcha-accessibility/

Privacy considerations in CAPTCHA recommendations #38

Open ysmartin opened 6 years ago

ysmartin commented 6 years ago

I would like to raise my concerns about the privacy issues that might arise when using some of the solutions presented, especially those recommended in the conclusions of the draft. I am aware that this issue does not directly address the questions for which feedback was requested, but I still think it is worth taking into consideration.

  1. Many CAPTCHA alternatives rely on third parties which implement and provide the CAPTCHA service, and which capture personal information from the interaction with the user.
    1. Single sign-on (SSO) services delegate authentication and identification to an Identity Provider (IdP), separate from the service provider (SP). Different business models exist for those SSO services, but quite often the identity provider offers the service for free, in exchange for exploiting the user's identity for its own business purposes. This raises privacy concerns: e.g. most SSO services provided by large Internet players allow the IdP to know which services are being accessed by the user, effectively tracking their Internet use. (This issue is called "Observability by central instances" in the literature. There are other potential privacy issues, such as the SP accessing some identity attributes disclosed by the IdP.) Details are discussed here: Privacy by Design in Federated Identity Management
    2. reCAPTCHA is run by Google, which gathers and processes personal data, including Google's own cookies, in order to determine whether the user is a human. Implementations run by other companies would have the same issues. Further explanation here: Privacy Policy for reCAPTCHA. Besides, the same central-authority observation concerns may apply here.
  2. Some other CAPTCHA alternatives rely on the disclosure of a lot of personal details certified by a third party, e.g. in Public Key Infrastructure (PKI)-based solutions. This means that the content or service provider has access to the identity of the user. Besides, nothing prevents that user from automating the provision of the certificate in response to the request. This happens because PKI provides identification of the user, which is orthogonal to whether the user is a human.
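To make the observability point in 1.1 concrete, here is a minimal sketch of why an IdP that sits in the middle of every login can reconstruct a user's browsing profile. It is a toy model only, not any real SSO protocol or library: the `IdentityProvider` class, its `authenticate` method, the `access_log` attribute and the `*.example` service names are all hypothetical and chosen just for illustration.

```python
from collections import defaultdict


class IdentityProvider:
    """Toy IdP (hypothetical, not a real SSO implementation)."""

    def __init__(self):
        # Maps each user to the ordered list of service providers
        # they have signed in to through this IdP.
        self.access_log = defaultdict(list)

    def authenticate(self, user, service_provider):
        # To issue an assertion for a given SP, the IdP has to be told
        # which SP is asking. That same information lets it record
        # where the user is going.
        self.access_log[user].append(service_provider)
        return {"subject": user, "audience": service_provider}


idp = IdentityProvider()
for sp in ["news.example", "health-forum.example", "bank.example"]:
    idp.authenticate("alice", sp)

# The IdP now holds a per-user list of visited services.
profile = idp.access_log["alice"]
print(profile)
```

In this sketch no service provider learns about the others, yet the IdP ends up with the full list for each user, which is the concern raised above.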

I am aware that this is a matter of trying to fulfil different categories of requirements (e.g. accessibility, privacy, security), which sometimes may require making trade-offs. I am also aware that the techniques currently in use might not provide better solutions (on the other hand, state-of-the-art techniques such as anonymous credentials or attribute-based credentials allow certifying attributes without disclosing the personal details to either party). Nonetheless, I would suggest:

  1. Including considerations of privacy issues, just as security considerations are mentioned in the document.
  2. Avoiding any strong, direct recommendation of CAPTCHA techniques that may introduce privacy issues, providing instead recommendations conditional on the advancement of other, privacy-enhancing techniques.
  3. Consulting with the W3C Privacy Interest Group (PING) on privacy issues of this note.

(Edited to fix grammar)