patcg / proposals

This repository is for discussing proposals before they've been spun off into their own repositories.

Why would notice and consent not be adequate? (Notice and consent debate) #5

Open jwrosewell opened 2 years ago

nlongcn commented 2 years ago

Thank you Robin. I did find that useful and I would appreciate any further documentation.

From: Robin Berjon. Sent: 03 March 2022 17:13. To: patcg/proposals. Subject: Re: [patcg/proposals] Why would notice and consent not be adequate? (Notice and consent debate) (Issue #5)

Kiran asked a good question on the public list that for some reason was not captured here as well. I am answering here to make sure we keep it in a single place. He asked if the issue with consent "is a limitation of browsers which cannot share significant portions of cross-context reading history at scale?"

The short answer is that this isn't a limitation of browsers but a limitation of what people can consent to through the kinds of large-scale interactions that exist on the Web and through browsers. But if you don't have the background on this topic, I think that this answer won't be satisfactory. So I thought it would be helpful to provide a short backgrounder on consent so that not everyone has to read all the things just to reach the same conclusion. In the interest of brevity I will stick to the salient points regarding consent that have brought us to the present day; experts on the topic should of course chime in if they feel I've missed an important part.

Informed consent as used in computer systems today (and specifically for data processing) is an idea borrowed from (pre-digital) research on human subjects. One particularly important foundation of informed consent is the Belmont Principles, most notably the first principle, Respect for Persons. The idea of respect for persons is that people should be treated in such a way that they will make decisions based on their own set of values, preferences, and beliefs without undue influence or interference that will distort or skew their ability to make decisions. The important thing to note here is that respect for persons is meant to protect people's autonomy in contexts in which their ability to make good decisions can be impaired.

The way that this is operationalised in the context of research on human subjects is through informed consent. At some point, someone looked at this and realised that things like profiling, analytics, A/B testing, etc. look a lot like research on human subjects (which is true). And so they decided to just copy and paste informed consent over onto computers, with the expectation that it would address problems of autonomy with data.

As often happens when people copy the superficial implementation onto computers but without the underlying structure that makes it work, this fell apart. First, one key component of research on human subjects is the Institutional Review Board (IRB), an independent group that reviews the research for ethical concerns. IRBs aren't perfect, but using an IRB means that in the vast majority of cases unethical treatment is prevented before any subject even gets to consent to it. Some companies do have IRBs (The Times does, as does Facebook for instance) but they can never be as open, independent, and systematic as they are in research. Second, the informed consent step is slow and deliberate, with a vivid depiction of risks. Subjects are often already volunteers. You might get a grad student sitting down with you to explain the pros and cons of participation, or a video equivalent.

What's really important to understand here is that informed consent is not about not using dark patterns and making some description of processing readable; it's about relying on an independent institution of multidisciplinary experts to make sure that the processing is ethical and, on top of this independent assessment of the ethics of the intervention, taking proactive steps to ensure that subjects understand what they are walking into. There are Web equivalents of informed consent - studies based on Mozilla Rally are a good example of this - but they work by reproducing the full apparatus of informed consent and not just the superficial bits that make the lawyers happy. Rally involves volunteering (installing an extension), gatekeeping to ensure that studies are ethical (e.g. the Princeton IRB validated the studies I'm in), volunteering again to join specific studies and being walked through a description before consenting, and then strong technical measures to protect the data (like, it is only decrypted and analysed on devices disconnected from the Internet).

None of this scales to the kind of Web-wide data processing that is required to make our advertising infrastructure work (or to enable many other potentially harmful functions). People have tried, but as shown repeatedly by the research I linked to previously (and more generally all the work on bounded rationality) it doesn't work. What "doesn't work" means is that relying on consent for this kind of data processing means that you end up with a lot of people consenting when in fact they don't want what they are consenting to; they are only doing it because the system is directing them in ways that don't effectively align with the requirements of informed consent. (To give just one example, Hoofnagle et al. have found that 62% of people believe that if a site has a privacy policy that means that the site can't share their data with other parties. Informed consent means eliminating that kind of misunderstanding and then providing a detailed explanation of the risks. It's a steep hill and few people have the time for it.)

One possible reaction upon learning this is to not care. Some people will say "well, it's not my fault that people don't understand how privacy law and data work - if they don't like it, we gave them a 'choice'." But giving people a choice that you already know they will get wrong more often than not isn't ethical and doesn't align with respect for persons.

As members of the Web community, however, we don't want to build unethical things. The Web is built atop the same ethical tradition that produced informed consent in research on human subjects: respect for persons. (We formulate it as putting people first, but it's the same idea.) Since we try our best to make decisions based on reality rather than on what is convenient, we can't in good conscience see that consent doesn't work and then decide to use it anyway. There is also a fair bit of evidence that relying on consent favours larger, more established companies which makes consent problematic from a competition standpoint as well. Because of this, it is incumbent upon us to build something better. (In a sense, we have to be the IRB that the Web can't have for every site.)

Is it technically possible to overturn this consensus? Of course. But we have to consider what the burden of proof looks like given the state of knowledge accumulated over the past fifty years that people have been working on this. Finding a lack of consensus requires more than just someone saying "I disagree"; it would require establishing that respect for persons is secondary (and reinventing informed consent on non-Belmont principles), or that bounded rationality isn't real, or high-powered empirical studies showing that people aren't tricked out of their autonomy, or some other very significant scientific upheaval. It might be possible, but we're essentially talking about providing a proof of the Riemann hypothesis using basic arithmetic: I don't believe that it's been shown that you couldn't do that, and there are very regularly people who claim to have done it, but it would be unreasonable to put anything on hold for that in the absence of novel, solid evidence.

I hope this is helpful for people who haven't been wrestling with this topic. What the charter delineates is helpful because it protects this group from walking down blind alleys that have been explored extensively with no solution in sight. If people find this kind of informal background helpful, I would be happy to document it more prominently.


ansuz commented 2 years ago

Thanks for your extremely thorough and articulate explanation, @darobin !

I've been following along with this discussion and generally haven't felt like the average web user is represented by many of the participants here.

It seems to me that many here take it as an accepted first principle that the web without advertising is inconceivable, and it follows that the best we can do is make advertising somewhat less terrible.

In my opinion, the possibility that unregulated or unsupervised web advertising cannot be reformed must not be beyond consideration. Arguing that "there is no alternative" is not a good look.

I will leave you all with a meme to consider:

[meme image: "why-does-X"]

lknik commented 2 years ago

First of all, I agree with the earlier comments that this thread is no longer providing much value. Secondly, I wanted to comment on one point made by @darobin:

That consent cannot provide a defensible approach for the kind of processing this group is working on does not mean that this becomes the "Fixing Consent Community Group." ...

That is fair. However, some laws may still need consent, even if private processing is required. Even if we may or may not consider such laws as outdated, they nonetheless may or may not exist in the EU and the UK, depending on the technical solution. (hence the "may or may not")

lbdvt commented 2 years ago

Hi @darobin,

It seems to me that the kinds of consent mechanisms you describe (Institutional Review Board...) are well suited to things like medical research (e.g. "Are you willing to take an experimental treatment for a life-threatening disease?"), but that there's a big difference in complexity and consequences compared to web advertising (e.g. "If you click OK, your visit to nicefurnitures.example may be used to show you ads later on the web.").

Could you please expand on the use cases behind this claim?

At some point, someone looked at this and realised that things like profiling, analytics, A/B testing, etc. look a lot like research on human subjects (which is true)

drpaulfarrow commented 2 years ago

Love the additional background, Robin, thanks!

It would be great if you could expand a little bit on certain points...

For example, you say: "And so they decided to just copy and paste informed consent over on computers, with the expectation that it would address problems of autonomy with data."

And:

"As often happens when people copy the superficial implementation onto computers but without the underlying structure that makes it work, this fell apart."

Who are you referring to when you say 'they', and how has it fallen apart in your view?

Also, I would love to hear your thoughts on the applicability of 'broad consent', as opposed to the more classical definition of 'informed consent' as it is stated in the Belmont Report (which actually itself acknowledges that "presenting information in a disorganized and rapid fashion, allowing too little time for consideration or curtailing opportunities for questioning, all may adversely affect a subject's ability to make an informed choice.", which would seem to make it a poor choice for webpage applications from the get-go!)

Cheers!

darobin commented 2 years ago

@lbdvt The thing we need to be very careful about is to not assume that how we want the system to be used is how it actually gets used. If we could ensure that your visit to Nice Furniture™ could only be used to show you a furniture ad for a comparatively short period of time thereafter, we'd be in a very different position compared to the one we're in now. If a study emerges in a few years showing that students whose parents are into nice furniture do less well in college, this data could be used by some universities to turn your kids down in the future. Evidently, this is a deliberately contrived example and it wouldn't be possible in all countries, but the key point is that privacy harms are time-shifted and impossible for people to predict. This has already had real consequences, for instance with tracking data used to hunt down undocumented people (WSJ, Vice).

Lack of purpose limitation is a constant source of problems in data. There's a strong sense in which guaranteeing purpose limitations is a key objective of this group.
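Purpose limitation as discussed here is a policy principle rather than any specific API, but the core enforcement idea can be illustrated with a toy sketch. Everything below (the `Record` class, the `access` function, the purpose names) is hypothetical and for illustration only, not part of any proposal:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A datum tagged at collection time with the only purposes it may serve."""
    value: str
    allowed_purposes: frozenset


class PurposeLimitationError(Exception):
    """Raised when data is requested for a purpose not declared at collection."""


def access(record: Record, purpose: str) -> str:
    """Release the value only if the requested purpose was declared up front."""
    if purpose not in record.allowed_purposes:
        raise PurposeLimitationError(
            f"purpose {purpose!r} was not among the declared purposes"
        )
    return record.value


# The visit is usable only for the purpose fixed when it was collected.
visit = Record("visited nicefurnitures.example",
               frozenset({"frequency-capping"}))

access(visit, "frequency-capping")  # permitted: declared at collection time
# access(visit, "admissions-screening") raises PurposeLimitationError
```

The point of the sketch is that the set of permissible purposes is fixed at collection time, so a later, unforeseen use (such as the college-admissions example above) is refused by construction rather than by after-the-fact policy.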

@drpaulfarrow I don't have a definitive text on the history, but my understanding from reading around isn't that one person one day decided to apply ideas from HSR to computer systems but rather that it happened because those were the conceptual tools "lying around" at the time. For instance, Records, Computers, and the Rights of Citizens already mentions data as research and the term "data subject" which became the norm. These assumptions are also in Convention 108, the 1995 Data Directive, and all the EU texts that follow. I've been meaning to look at the Conseil d'État's 1970 report on this to see what's there. If you're interested, you might be able to dig into the Hessischer Landtag's Vorlage des Datenschutzbeauftragten that was influential at the time.

There is a related thread that concerns more general permissions for a website to access more powerful capabilities (including data) and that has been a recurring unsolved issue in the W3C and broader Web community. It reads like a list of failures trying to rely on consent when risk is involved: ActiveX, Java applet security model, Device APIs & Policy WG (with similar issues in WebApps and HTML WGs), the PowerBox proposal from Mozilla & Sony Ericsson, delegated trust… We looked at the problem again as recently as 2018 and there wasn't much progress in terms of the state of the art.

In terms of what fell apart: the short version is that we are looking at an absence of autonomy in the processing of personal data and significant data protection impacts.

I think broad consent is certainly an interesting way to think about options! I suspect you might have more experience with it than I do from your previous work? One key component of broad consent is that what is being consented to is generally quite limited in scope (even if not in detail) and still under IRB accountability. I'm not sure that I see it becoming useful in our context because, by the time we've enforced purpose limitations, does consent (of any kind) add something valuable on top?

alextcone commented 2 years ago

Lack of purpose limitation is a constant source of problems in data. There's a strong sense in which guaranteeing purpose limitations is a key objective of this group.

Right on, @darobin

kirangopinath71 commented 2 years ago

I'm not sure that I see it becoming useful in our context because, by the time we've enforced purpose limitations, does consent (of any kind) add something valuable on top?

Thanks for the clarification, Robin. Agree with you that purpose limitation will solve a major part of the problem.

Consent might still be required for users to opt in and out of specific data sharing, even with purpose limitation. E.g., a 14-year-old girl curious about pregnancy test kits might not want to share her reading/search data for any purpose, to avoid potential harms (which could range from embarrassment to harassment to worse).

There is also a need to make consent management more frictionless, easier, and less annoying than cookie banners, while still being available upfront to the user; this could perhaps be added to the proposed solution.

dmarti commented 2 years ago

@kirangopinath71 This is a good example of a situation where it is important to first evaluate whether or not it is appropriate to ask for consent at all.

When designing a system, we have to take into account the user research literature and what we can be reasonably expected to understand, as human web developers, about human web users.

The number of situations in which any human being would want to share with anyone else that they were browsing a web page about a pregnancy test kit is vanishingly small, so asking for consent is going to produce far more errors than true consent. Asking for consent would not only waste the time of everyone who answered the consent prompt correctly, but also result in inappropriate processing of the information of everyone who failed to get it right. This specific case is a good example of where consent is not a good fit. (Users who really want to share this information can do so on their own.)

We have to do a better job of looking at the user research to determine not just when to ask for consent (and how to do it, and when to skip it), but how to apply a "consent yes" setting to real-world data processing decisions. About 36% of users are more likely to engage with personalized ads but that probably doesn't mean that they want all their pharmacy shopping habits shared.

jwrosewell commented 2 years ago

Movement for an Open Web (MOW) has now published an analysis of the most recent guidance from UK competition and data protection bodies in relation to consent. This analysis has a bearing on the answer to this question and on the role of consent in solutions developed by this group and, more broadly, across the W3C and other standards forums.

alextcone commented 2 years ago

@jwrosewell - can you make clear what you mean by the title of this Issue?

Why would notice and consent not be adequate?

Adequate for what?

  1. As a replacement technology for the removal of cross-site/app identifiers?
  2. As a mechanism on top of the status quo of cross-site/app identifiers that haven't all gone away yet?
  3. As approaches for user controls for new purpose limited "private ad technologies" incubated by PATCG?
  4. Something else?

As I read through this monster thread it appears there is a lot of miscommunication going on and I think that may be due in part to not finishing the sentence ("adequate for..."). Is it 1, 2, 3 or 4? If 4, please let us know adequate for what.

jwrosewell commented 2 years ago

Question came from the last meeting. Minutes are here.

I've copied the minutes below and added clarification in square brackets.

James R: thanks! Regarding consensus about new APIs. Not clear what the problem is with existing APIs. Discussing some of the very largest companies, but not the majority of participants in this group, and those larger companies are engaging in more groups. At w3c, believe we should use existing functionality / lego bricks. Proposals from different browsers or gatekeepers tend to play to their functionality/advantage. Not clear why we need to make any change from existing APIs.

Ben: Facebook doesn’t operate a major web browser, but looking at large browser vendors to find the possibility of shipping an API across major browsers. Don’t want to waste time on proposals that won’t be shipped by major web browsers.

James: confused why notice and consent wouldn’t be adequate [because that is a position major web browsers seem to have taken and is a constraint that proposers seem to be working to]. For default on, not sure who controls the defaults [and how users consent to these defaults]. Why [default to a proposal that] use[s] a multi-party compute solution?

Martin: We do not have the time to fully answer that question.

Aram: Agreed, please open an issue in the proposal space or on the issue thread.

This question, and the resulting thread and interest, suggest it might be important to answer it considering all the information provided.
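For readers without the background on the "multi-party compute solution" mentioned in the minutes above: proposals in this space (IPA, for example) build on secure multi-party computation, in which each user's value is split into shares that individually reveal nothing, and only aggregates are ever reconstructed. Below is a minimal illustrative sketch of additive secret sharing, not any proposal's actual protocol; real deployments use vetted MPC protocols with stronger security guarantees:

```python
import secrets

MODULUS = 2**31 - 1  # all arithmetic is done modulo a fixed prime


def share(value: int, n_parties: int = 2) -> list[int]:
    """Split `value` into shares that individually look like random noise."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recombine shares; only the sum of ALL shares reveals the value."""
    return sum(shares) % MODULUS


# Each user splits a private bit (e.g. "did I convert": 0 or 1) between
# two non-colluding helper parties.
users = [1, 0, 1, 1, 0]
shares_per_user = [share(v) for v in users]
party_a = [s[0] for s in shares_per_user]  # party A sees only random-looking data
party_b = [s[1] for s in shares_per_user]  # so does party B

# Each party sums its own shares locally; only the aggregate is reconstructed.
total = reconstruct([sum(party_a) % MODULUS, sum(party_b) % MODULUS])
assert total == sum(users)  # aggregate recovered without exposing any individual
```

No single helper party learns any individual's value; both would have to collude to reconstruct it, which is the trust assumption such proposals make explicit.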

alextcone commented 2 years ago

So it seems the second half of the original Issue question of "Why would notice and consent not be adequate?" is:

  1. As a mechanism on top of the status quo of cross-site/app identifiers that haven't all gone away yet?

I base this interpretation on the following quote from your indirect reply directly above (emphasis mine):

James R:thanks! Regarding consensus about new APIs. Not clear what the problem is with existing APIs. Discussing some of the very largest companies, but not the majority of participants in this group, and those larger companies are engaging in more groups. At w3c, believe we should use existing functionality / lego bricks. Proposals from different browsers or gatekeepers tend to play to their functionality/advantage. Not clear why we need to make any change from existing APIs.

If my interpretation is correct, then it seems your intention for this Issue was to propose that this group focus on a notice and consent mechanism for existing APIs (I presume third-party cookies?), and not on whether any net new purpose-limited APIs, like IPA for example, should be subject to best practices and legal requirements as far as user-level transparency and control mechanics go. Regarding the latter, I don't see anyone objecting to making new APIs subject to best practices and legal requirements for user-level transparency and control.

So if the intent behind the Issue was notice and consent on top of existing web APIs, I believe you should say this plainly (without making people dig through minutes and draw further inferences). I think a lot of the back and forth in this thread could have been avoided.