privacycg / nav-tracking-mitigations

Navigation-based Tracking Mitigations
https://privacycg.github.io/nav-tracking-mitigations/

Bounce tracking and technical stability for PETs #64

Open adnyfl opened 1 year ago

adnyfl commented 1 year ago

Hello everyone, thanks for all the work you’ve been doing to limit harmful cross-site tracking and protect user privacy and security. It’s great to see how the browsers are putting an end to the systematic collection and sharing of personal data.

I want to bring back a question that has been mentioned in passing but was never properly considered. Once measures like Bounce Tracking Mitigation are in place, how will browsers ensure, from a technical perspective, that it is still possible for privacy-enhancing technologies (PETs) to operate? I’m hoping we can focus on the technical standardisation of the approach, on the understanding that innovation and competition are good for users and publishers so long as privacy, security and user experience are protected.

The BTM Explainer has a few short paragraphs on Block/Allow Lists but it’s unclear how this will work in practice. Let’s take the scenario of a technology that uses bounce tracking or an alternative cross-site tracking technology in a privacy-preserving way - not some sort of ad tech whitewash but something that meets the most stringent legal requirements on privacy (perhaps even on a path towards achieving zero trust status). For example, a PET that by design and default only tracks cross-site data with opt-in user consent, makes data deletion and portability easy, aggregates and processes the data in the browser, minimises the data accessible by the tracking company and does not share personal data with any third party. It is both possible and desirable for the market to produce software that delivers the benefits of cross-site data without compromising privacy and security, but the rules must be clear.

How will browsers provide technical stability for PETs? Mozilla has done a great job explaining their red lines at disconnect.me, although manually curated allow-lists are probably better for privacy than blocklists. Either way, the ground rules for PETs must be tightly defined for transparency and consistency.

Keen to hear the group’s thoughts.

wanderview commented 12 months ago

The bounce tracking mitigations spec attempts to honor user consent by looking for a user interaction (click, WebAuthn key tap, etc.) on a top-level 1P page of the given eTLD+1 site. The length of time this interaction protects the site is a browser-defined value, but in Chrome it's currently implemented as 45 days.

The underlying principle behind this is that we have some minimal signal that the user has taken an action on the page while the site is visible in the URL bar.
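The interaction check described above can be sketched roughly as follows. This is an illustrative sketch only, not the algorithm from the spec or from any browser: the function name, the parameter names, and the inclusive boundary are assumptions, and the 45-day window is simply the Chrome value quoted in this comment.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 45 days is the Chrome value quoted in this thread.
# The window is browser-defined, so other browsers may use another value.
INTERACTION_WINDOW = timedelta(days=45)


def is_protected_by_interaction(
    last_interaction: Optional[datetime],
    now: datetime,
    window: timedelta = INTERACTION_WINDOW,
) -> bool:
    """Return True if a site (eTLD+1) is still covered by a user interaction.

    `last_interaction` is the time of the most recent user interaction
    (click, WebAuthn key tap, ...) on a top-level first-party page of the
    site, or None if no such interaction was ever recorded.
    """
    if last_interaction is None:
        # No interaction recorded: nothing protects the site.
        return False
    # Hypothetical choice: treat the boundary as inclusive.
    return now - last_interaction <= window


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
recent = is_protected_by_interaction(now - timedelta(days=10), now)
stale = is_protected_by_interaction(now - timedelta(days=46), now)
never = is_protected_by_interaction(None, now)
```

Under these assumptions, a site the user clicked on 10 days ago is still covered, while one last interacted with 46 days ago, or never, is not.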

In your system that collects user consent, do you not require the user to make a click or other interaction on your domain? If not, what prevents your system from collecting this kind of consent?

adnyfl commented 12 months ago

The user has the ability to make a click and interact on the root domain, which is in fact encouraged and facilitated. But there can be no requirement to do so because it would compromise the user experience. Consent is received by default from the publisher’s consent management platform. Unlike Topics, where one single consent choice allows Google to collect data from a large number of domains, in the case described consent must be provided by the user on each domain.

The system does not work on a domain without receiving the consent signal (e.g. it does not work on most US sites today) and our priority is to ensure that (a) there are no dark patterns at play when consent is collected and (b) there is an easy way for the user to withdraw the consent.

We’re not totally fond of current consent standards but European courts and data protection authorities are driving improvements, and hopefully browsers will come up with a better solution in due course. But for now this is what we have in the open web. More fundamentally, bounce tracking cannot be considered separately from on-device aggregation, processing and anonymisation which differentiate the PET from harmful forms of tracking. When assessing user privacy and security one has to look at the whole system, not just one part in isolation.

wanderview commented 11 months ago

Our goal is to protect user privacy to the greatest extent possible while preserving critical user journeys on the web. To do this, the browser needs technical signals that the user is both aware of and approves of cross-site data sharing. The user interaction requirement in bounce tracking mitigations was selected as a very low-cost signal that could provide some indication of this awareness and approval.

While the bounce tracking mitigation interaction signal "just works" for many user journeys on the web, it will not match all existing patterns on the web. We've done our best to minimize the need for changes, but invariably some sites will need to make changes to support new privacy protections. It seems the use case you describe may fall in this category.

Note, site changes being required for web ecosystem improvements is not unique to this situation. For example, due to security issues with the appcache API we worked with sites to migrate to the new service workers API as an alternative. That change was much more costly for sites to implement compared to the interaction requirement for bounce tracking mitigations.

There are a couple of solutions we can suggest. While both would require code changes and some user action to signal to the browser intent and understanding, we do not think this user friction should be too onerous in cases where the user already understands what is happening. For example, when a site explains that they need to use the camera for the current user journey, users are very likely to accept the camera permission prompt.

The possible solutions we can currently suggest are:

adnyfl commented 11 months ago

We’ve demoed both proposals and the claims that the user interaction on the third-party domain is a “low-cost signal” and that user friction wouldn’t “be too onerous” do not stand up to scrutiny. Below is a summary of the user journey you would expect websites to create.

Acknowledgement notice proposal

Expected user journey:

Please also note that your proposal of adding an ‘acknowledgement notice’ and then redirecting the user back to the original domain is hardly an improvement to privacy or transparency. It actually reads more like a trick to circumvent the artificial interaction rule created by BTM. Under GDPR, the user can’t merely ‘acknowledge’ something: they would again need to provide freely given, specific, informed and unambiguous consent. In order to be lawful, your proposal would therefore require the user to provide the same consent twice, once on the publisher's site and once on the third-party site.

As you surely know, the consent popup already creates friction and is widely seen as creating a bad user experience across the open internet, so much so that authorities are thinking about how to change it. The consent popup involves one popup and one click. Adding two redirections, one additional popup and one additional click would make the open internet unusable. No publisher would agree to compromise their user experience in this way. Moreover, publishers worry about anything that redirects the user away from their domain and requires an action to be brought back because this carries a high risk of losing traffic. This proposed solution is a commercial non-starter.

Storage Access API proposal

Expected user journey:

While we are open to using the Storage Access API, this proposal is hardly an improvement on the first one. It erroneously equates the PET with a third-party cookie, creates a horrendous user experience and mandates the use of iframes that publishers would not allow on their sites. None of this is intuitive for the user, acceptable to a site owner or scalable in any way. In sum, this proposal is also practically and commercially unviable.

Two-tier system

Compare this to the user journey Google has created for the Privacy Sandbox, where: (a) either consent is not collected at all, replaced by an opt-out mechanism (meaning that the user is neither "aware" nor "approves of cross-site data sharing") or (b) a consent prompt appears automatically in the browser window and one single click is sufficient for the user to accept tracking on almost all known domains (and to simultaneously accept that Topics will be shared with third parties). The Privacy Sandbox already has a much simpler user journey than the PET’s current approach to consent (which must be obtained on each domain), and your proposed ‘solutions’ would require the user to perform at least two clicks and two redirects, or three clicks and three popups on every domain. This is a perfect example of a two-tier system in which Google uses its gatekeeping position to gain significant advantages over competitors.

If the goal is to protect user privacy, the Privacy Sandbox and the other browser measures cannot be the only solutions. If your concern is about user choice, it is undoubtedly easier to obtain freely given, specific, informed and unambiguous consent at the domain level (the level at which the PET operates) than at the browser level (the level at which Topics operates). Sure, consent processes can and should be improved for everybody - for Google products, for the platforms, for publishers and for independent vendors - but the proposals go in the wrong direction for the user and create an uneven playing field.

In the short term, using an allow list for tracking use cases that meet privacy standards provides a solution that leaves enough space for the market to innovate while enforcing high standards of privacy by default. In the longer term, improvement to consent processes across all web properties will be needed. If you have other ideas on how to find a fair solution we are happy to discuss this with an open mind, but please let’s make sure they offer a realistic path forward for users and publishers.

BrianLefler commented 11 months ago

We do not plan to give better support for consented bounce tracking. Your understanding of how websites could operate within the current proposal is correct, but I don’t think Chrome or other browsers will promise stability for unsupported use cases, so the most prudent option for websites is to stop relying on bounce tracking on the same schedule that they would stop relying on third-party cookies.

We arrived at this position because bounce tracking raises the same set of privacy concerns as third-party cookies. Developers can create consented tracking systems right now with third-party cookies, but the browser has no way to guarantee any of that to users. Users want better privacy and it’s why we are phasing out third-party cookies in Chrome. The Privacy Sandbox intends to satisfy important tracking purposes through a dedicated set of APIs which pose fewer privacy concerns than third-party cookies.

Because a move from third-party cookies to bounce tracking is not an improvement in privacy, we do not think bounce tracking is an appropriate migration strategy.

Bounce tracking does not seem like a good technology to build on. Compared to tracking with third-party cookies, bounce tracking is slower and disrupts the user experience during its redirects. Compared to privacy-preserving APIs, it offers fewer browser privacy guarantees. The sole technical advantage of bounce tracking seems to be its ability to track users who have third-party cookies disabled. We saw that as a privacy hole, not a feature, and bounce tracking mitigations are our attempt to close this hole.

The privacy community and the Technical Architecture Group have been supportive of the current proposal. Changing the proposal as you suggest would reduce browser privacy guarantees and seems likely to erode community support. If you want to advocate further, a possible next step could be to write up a proposed change, contact the W3C’s Privacy Community Group and get on their meeting agenda.

adnyfl commented 10 months ago

In principle, we understand the points you are making. However, your post is problematic in several ways.

Mischaracterisation of bounce tracking

Both the statement that “a move from third-party cookies to bounce tracking is not an improvement in privacy” and the characterisation of bounce tracking as a simple ‘cookie workaround’ are wrong and misleading.

  1. First of all, there are design differences that make bounce tracking inherently more privacy-preserving than third-party cookies (TPCs). A TPC is designed to be read on the publisher domain by other scripts, creating a data surface area that can be used by other companies to identify users. Bounce tracking (BT) is designed to store a first-party cookie (FPC) on the authentication domain, and no cross-site identifiers on the publisher domains. Of course, BT could be used as a TPC (uploading a cross-domain ID on all domains) but it is not designed to work that way, just as the Privacy Sandbox could send the data it collects from Chrome to Google servers but is not designed to do so. There is one key difference: if the Sandbox were to send data to Google servers, it would be extremely hard for anyone to see. If BT were to be used like a TPC, it would be extremely easy for anyone to see - browsers and users alike.
  2. Secondly, your argument that “compared to tracking with third-party cookies, BT is slower and disrupts the user experience during its redirects” is only partly correct: the user experience can be protected, but it is true that BT is not as seamless as a TPC. For example, a publisher can have dozens of TPCs on a site, but only one BT solution. This limits BT usage and pervasiveness, making it even more unlikely to create the sort of sprawling interoperable data that TPCs have enabled over the past 20 years.

Mischaracterisation of our proposal

The fact that BTM has attracted support from TAG is welcome and in line with our proposal. We recognise the privacy risks of BT if it is left unchecked, as it could be used like a TPC. That is why we support BTM with an allow-list for advertising uses that are demonstrably privacy-preserving. An allow-list does not "reduce browser privacy guarantees", it just opens the door to alternative PETs. For instance, a technology provider can declare the way it collects and uses cross-site data, be audited and - if privacy requirements are met - added to an allow-list. Browsers can easily check that the technology continues to operate in line with privacy requirements and, if that is not the case, disable it. In fact, because data collection and processing happens in the browser, anyone can see what is going on at all times and expose misuse - as I’m sure our friends at the EFF and ICCL would do.

Misleading suggestion that PET can only be delivered by the browsers

The most problematic part of your post is that it implies that privacy can only be delivered through a browser monopoly over PET, despite the poor privacy record and conflicts of interest of some browser-makers. This approach is philosophically, technically, and legally wrong. If a cross-site technology is demonstrably used in a privacy-preserving way, then browsers have no legal basis to disable it for that particular use.

The discussions around the Privacy Sandbox are not about 'setting a standard', they are about building a product. A standard would have clear and transparent privacy rules that everyone could follow to build a product, it would set the privacy objectives and technical guidelines based on regulatory requirements. Instead, the Sandbox and other browsers publish guidelines on how to use their products, without a publicly available privacy standard. This strategic ambiguity allows Google and others to pre-empt all current and future competition, regardless of their privacy merit.

Conclusion

Consent-based BT and shared storage are the two most privacy-preserving cross-site approaches. Ben suggested using Shared Storage API as an alternative to BT, but made it conditional on a user journey that is impossible to implement, so it didn’t look like a genuine suggestion. If the offer is genuine, we are open to discussing it. More broadly, the Privacy Sandbox cannot be the only accepted cross-site technology. It’s important that Google and the browsers protect the right of cross-site PETs to operate in the market instead of mandating their own proprietary solutions and shutting the door to all others. That way, we can all focus on increasing privacy protection using the same set of rules.

dmarti commented 10 months ago

Redirects do slow down navigation, but users already accept a redirect every time they click through on a result from most search engines. Results typically go to a tracking URL that redirects to the result page.

MasterInQuestion commented 3 months ago

= Bounce-Tracking and tech stability of PET (Privacy-Enhancing Technologies) =

[ adnyfl @ CE 2023-10-03 13:30:43 UTC:
https://github.com/privacycg/nav-tracking-mitigations/issues/64#issue-1924161077
    Hello everyone, thanks for all the work you've been doing to limit harmful cross-site tracking; and protect user privacy and security.
    It's great to see how browsers are putting an end: to the systematic collection and sharing of personal data.

    I want to bring back a question, that has been mentioned in past but never properly considered:
    [
    Once measures like Bounce-Tracking Mitigation (BTM) are in place:
    How will browsers ensure, from technical perspective: that it is still possible for some PET to operate? ]

    I'm hoping we can focus on the technical standardisation of the approach:
    On the understanding that innovation and competition are good for users and publishers:
    So long as privacy, security and user experience are protected.

    The BTM Explainer has a few short paragraphs on Block/Allow Lists:
    https://github.com/privacycg/nav-tracking-mitigations/blob/main/bounce-tracking-explainer.md#blockallow-lists
    ; but it's unclear how this will work in practice.

    Let's take the scenario of a technology, that uses bounce-tracking, or alternative cross-site tracking technology in a privacy-preserving way:
    Not some sort of adtech whitewash, but something that meets the most stringent legal requirements on privacy: perhaps even on a path towards achieving zero-trust status.

    For example, a PET that by design, and default only tracks cross-site data with opt-in user consent:
    |*| Makes data deletion and portability easy.
    |*| Aggregates and processes the data in browser.
    |*| Minimises the data accessible by the tracking company. And does not share personal data with any third party.

    It is both possible and desirable for the market to produce software:
    That delivers the benefits of cross-site data, without compromising privacy and security.
    But the rules must be clear.

    How will browsers provide technical stability for alike PET?
    Mozilla has done great job explaining their red lines at: https://disconnect.me/trackerprotection
    ; although manually curated allow-lists are probably better: for privacy; than blocklists.

    Either way, the ground rules for PET must be tightly defined for transparency and consistency.

    Keen to hear the group's thoughts. ]

----

    How could such nonsense ever be considered "Privacy-Enhancing Technologies"?

    "Feel safe to leak to us. We promise we won't leak it."