WICG / first-party-sets

https://wicg.github.io/first-party-sets/

Is FPS good for users? #53

Open martinthomson opened 3 years ago

martinthomson commented 3 years ago

First-party sets (FPS) violates some central principles of the modern web. Specifically, the priority of constituencies puts user interests first; and the Vegas Rule says that what happens on one site stays with that site.

The ideal these principles describe is one where the identity that a person presents to a site is under their control. In particular, the identity that a user presents to a site is something that is specific to that site.

FPS substitutes the user-visible notion of a domain with the abstraction of "organization". In doing so, it removes control over how users interact with sites.

FPS would deprive users of a choice about whether information propagates across an organization through their various interactions with that organization. In place of control, users are given the sop of being able to discover the extent to which they have lost control. FPS proposes new, untested mechanisms that aim to improve visibility of shared ownership, which are not an adequate substitute for control.

So, the issue name is not an accident.

If there is any benefit to users, that benefit is at best indirect. Users only benefit to the extent that the sites they value might be better able to function.

The Bargain

This gets to the central concern. People have built sites that rely on cross-site cookies. Taking those cross-site cookies and storage away makes those sites stop working.

Long term, the answer to that is likely something browser-mediated. That might be storage access (I don't think that it is, for reasons similar to the ones I describe here) or it might be something more narrowly focused like a federated identity-specific API (in exemplar only, that proposal is a very long way from being realized).

Obviously, those sorts of long-term things take time. In the interim, it makes sense to provide sites that depend on cross-site state help. Early implementations of in-browser controls for cross-site state have exemption lists. Those lists are a manifestation of browser-specific policies about what sort of information transfer is needed to keep the web functioning for most users.

FPS attempts to codify the practice of building lists by allowing sites to self-declare. It replaces browser-curated allow lists with self-declaration of the same, plus browser-curated deny lists. Recently, there was an attempt to define a singular policy for those deny lists, so that sites would have more certainty about compatibility. To give due credit, if you start from the assumption that lists are necessary, this is a good thing as uncertainty and inconsistency only hurts smaller actors.
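To make the contrast between the two list models concrete, here is a minimal Python sketch. The function names, site names, and data shapes are hypothetical and chosen only for illustration; neither the FPS proposal nor any browser defines them.

```python
# Hypothetical browser-curated allow list: pairs of sites the vendor exempts.
BROWSER_ALLOW_LIST = {frozenset({"news.example", "login.example"})}


def allowed_by_curated_list(site_a: str, site_b: str) -> bool:
    """Browser-curated model: the browser alone decides which pairs are exempt."""
    return frozenset({site_a, site_b}) in BROWSER_ALLOW_LIST


def allowed_by_declared_set(site_a, site_b, declared_sets, deny_list) -> bool:
    """Self-declaration model: sites declare sets and the browser only vetoes."""
    for members in declared_sets:
        if site_a in members and site_b in members:
            # The browser-curated deny list can still exclude individual sites.
            return site_a not in deny_list and site_b not in deny_list
    return False
```

In the second model the default flips: a declared set is honoured unless the browser has listed one of its members for denial.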

The effect of all of this is that the short-term solution becomes a permanent one. A FPS declaration that complies with agreed standards for inclusion carries an expectation that the bargain is respected. There is usually no time limit on these bargains on the web. Pursuing standardization for FPS effectively makes it an indefinite commitment.

The browser-curated lists constitute no such promise. Indeed, there is an understanding that the lists are temporary. This is obviously imperfect - and increasingly so over time - but the intent is clear. And, partly owing to the inconsistency in how lists are implemented, no site can rely on them.

All of this is to say that we're in a transition period. In the previous state, activity on any site could be joined through the use of cross-site cookies. In the liminal state we're currently in (for most browsers, if not most users), a limited number of sites retain that capability through a patchwork of special exclusions. In the state we seek, users can interact with sites and present a different identity to each site without that information being used without their knowledge and direct engagement.

FPS would be better than the old state of the web. But if we are to consider it part of the future of the web, it does not appear to be a good deal for users.

History of this Issue

I apologize for opening an issue that is effectively a duplicate of numerous other issues, but it seems like this central point is not being addressed in any meaningful way. This issue has been raised in a few different ways, but I'm trying to make a cogent, singular issue that addresses what I think is the most important question to address.

This issue has been discussed in a number of places. I reviewed open issues and found clear statements to the same effect in #6, #7, #14, #30, and likely more closed issues. There are also mentions of the notion or issues adjacent to it in #19, #21, #22, #23, #28, #31, #40, #49, #50, and likely more closed issues.

There is a lot of cost and complexity to building and maintaining a system like the one that is proposed here, with lots of interesting implications for how sites, intermediaries, and adjacent services operate. I see most of the other issues being related to that. But answering the big question in the negative makes a lot of that moot.

johnwilander commented 3 years ago

Thanks for writing this up, Martin! Apple WebKit has repeatedly said in the context of FPS that we have no intention of relaxing cookie blocking based on something like FPS. We would instead use FPS to reason about other things in the engine. An example we've mentioned is to potentially alter wording in the prompt for storage access of a dedicated SSO domain within a set. That is why we've said that we'd like the FPS proposal to be reasonably disconnected from its intended use in particular browsers.

I believe Chrome intends to allow automatic cross-site cookie access within a set which causes a lot of the concerns you have. It has been brought up that even just Chrome relaxing their future cookie blocking in this way may put significant pressure on other browsers to do the same which is a real concern even if the FPS proposal doesn't mandate it.

pes10k commented 3 years ago

FWIW, Brave disables FPS in Blink and does not currently have any plans to enable, for any purpose.

AramZS commented 3 years ago

Yeah, thanks for writing this all up! I think I see the purpose of first party sets within the context of the intermediate state that you discussed, but with the end of third-party (3p) cookies in Chrome pushed further back now, it seems to me that building a feature like FPS, whose purpose is mostly to cover the conditions of that intermediate state, is less needed now. I agree that there are a lot of interesting issues that FPS intends to address, but the attempt to bucket them all together to be solved by FPS seems the wrong way to go.

It might be useful to look at a break out session to step back, look at the specific issues that FPS intends to address, and other purposes it might be useful for, such as @johnwilander's interest, and move to addressing those more directly without being blocked by the additional concerns that FPS specifically creates?

martinthomson commented 3 years ago

@johnwilander if Safari has one use for this and Chrome another, with Brave disabling it entirely, is that a good outcome? Sites won't be able to rely on a consistent set of browser behaviour.

I think that @krgovind did a good job of articulating Google's vision here. That is, they would like to define a privacy boundary for the web. The whole eTLD+1/registerable domain thing is largely an accident of history rather than something deliberate and trying to tame that is a worthy goal.

I'm pushing back on the idea that this boundary needs to be expanded to cover multiple domains, not just in the manner specifically described, but in any way. The specifics of the proposal don't really matter until we agree on the goals.

I have a keen interest in solving some of the use cases that were used to motivate FPS. I also think that some of those do not need solving at all. For most of those, longer-term solutions that are more narrowly targeted are far better, like some sort of browser-mediated federated login, which would target SSO scenarios more precisely even than isLoggedIn+FPS. And if we are talking remedial work for those who can't justify a major bit of rearchitecture, I generally point to storage access.

johnwilander commented 3 years ago

@johnwilander if Safari has one use for this and Chrome another, with Brave disabling it entirely, is that a good outcome? Sites won't be able to rely on a consistent set of browser behaviour.

I would love consistency on this front but look at where we are. Is it likely that we'll ever reach consensus on tracking prevention? I don't want to give up on that dream but not much is pointing in that direction.

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access. Browsers that want to relax cookie blocking within first party sets can automatically grant storage access within each set, whereas browsers that don't want to relax cookie blocking can instead ask for the user's permission along the lines of today's rules or provide partitioned cookies. That way browsers would be guaranteed to get a chance to impose their policy and not be pushed to regress their privacy protections because of interoperability issues.
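The proposal above can be modelled as a single decision point that every browser reaches only when the embedded site asks for access. The following Python sketch is an illustration under stated assumptions: the `Policy` and `Outcome` names, the site names, and the choice to fall back to a prompt outside a set are hypothetical and not taken from any specification.

```python
from enum import Enum
from typing import Iterable, Set


class Policy(Enum):
    AUTO_GRANT_WITHIN_SET = "auto-grant within set"
    PROMPT_USER = "prompt user"
    PARTITION = "partitioned cookies only"


class Outcome(Enum):
    GRANTED = "unpartitioned access granted"
    PROMPT = "ask the user"
    PARTITIONED = "partitioned cookies only"


def same_set(a: str, b: str, sets: Iterable[Set[str]]) -> bool:
    """True if both sites are members of one declared first-party set."""
    return any(a in members and b in members for members in sets)


def resolve_request(top_site: str, embedded_site: str, sets, policy: Policy) -> Outcome:
    """Model of what a browser does when the embed requests storage access."""
    if policy is Policy.AUTO_GRANT_WITHIN_SET and same_set(top_site, embedded_site, sets):
        return Outcome.GRANTED
    if policy is Policy.PARTITION:
        return Outcome.PARTITIONED
    # Assumption: anything not auto-granted falls back to asking the user.
    return Outcome.PROMPT
```

Because access is only ever decided inside `resolve_request`, a site that works in one browser has already written the code path that lets another browser apply a stricter policy.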

jespertheend commented 3 years ago

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access.

That sounds like the way to go imo. But it would be nice to have a way to know in advance if the user agent will show a permission pop up before making a call to requestStorageAccess(). Sometimes storage access is not super important and there are cases where the intrusiveness of showing a permission popup is not worth the benefits of storage access.

johannhof commented 3 years ago

(Not sure whether the maintainers find this sort of meta-discussion particularly well placed in a GitHub issue, but it's certainly interesting)

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access. Browsers that want to relax cookie blocking within first party sets can automatically grant storage access within each set, whereas browsers that don't want to relax cookie blocking can instead ask for the user's permission along the lines of today's rules or provide partitioned cookies. That way browsers would be guaranteed to get a chance to impose their policy and not be pushed to regress their privacy protections because of interoperability issues.

We suggested this in #42. The main thought (to rephrase what you're saying) is that FPS includes an (async) option for denial. This allows browsers to either ask users or even disable it out of principle, and the website needs to react to this possibility. There's still a risk here that sites will start consent-walling their access instead of dealing with failure gracefully, thus eliminating real user choice, but at least we'd ensure user participation before cross-site access is allowed.

johannhof commented 3 years ago

Sometimes storage access is not super important and there are cases where the intrusiveness of showing a permission popup is not worth the benefits of storage access.

Are you talking about a scenario that "isLoggedIn" would solve? https://github.com/privacycg/storage-access/issues/8

Other than "logged-in vs. logged-out users" I'm not aware of other "know who to prompt" kind of issues. Would be interesting to know about your specific case, but please add a comment in the above issue to avoid side-tracking the FPS discussion :)

johnwilander commented 3 years ago

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access.

That sounds like the way to go imo. But it would be nice to have a way to know in advance if the user agent will show a permission pop up before making a call to requestStorageAccess(). Sometimes storage access is not super important and there are cases where the intrusiveness of showing a permission popup is not worth the benefits of storage access.

Much better to allow the caller of the Storage Access API to state what it wants than to expose state to the caller. An option to document.requestStorageAccess() that says “just give me partitioned cookies if you’d have to prompt” or “just block if you’d have to prompt,” depending on the browser’s policy.
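As a rough model of that idea, the Python sketch below lets the caller state a fallback instead of learning whether a prompt would appear. The parameter name `if_prompt_needed` and its values are hypothetical; the comment above proposes the behaviour but does not name an option, and this sketch does not describe an existing `document.requestStorageAccess()` parameter.

```python
def request_storage_access(would_prompt: bool, if_prompt_needed: str = "prompt") -> str:
    """Model of a request where the caller states its preferred fallback.

    `would_prompt` is browser-internal state and is never exposed to the caller.
    """
    if not would_prompt:
        return "granted"
    if if_prompt_needed == "prompt":
        return "prompt"
    if if_prompt_needed == "partitioned":
        return "partitioned"
    if if_prompt_needed == "block":
        return "blocked"
    raise ValueError(f"unknown fallback: {if_prompt_needed!r}")
```

The caller only observes the final result, so it cannot use the API to probe whether the browser would have prompted.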

krgovind commented 3 years ago

Thanks for the discussion here, everyone!


Responding to @martinthomson

First-party sets (FPS) violates some central principles of the modern web. Specifically, the priority of constituencies puts user interests first; and the Vegas Rule says that what happens on one site stays with that site.

Thank you for starting from first principles. Since we are discussing the long-term, desired end state for the web; that is precisely where we should begin.

Interestingly, the Vegas Rule reference that you linked to actually says “what happens with the first party stays with the first party”, where “first-party” is not defined as “site”, but as “a data controller of the data processing that takes places as a consequence of a user interacting with it.”. This is consistent with our UA Policy Proposal for First-Party Sets which requires that domains within a set must have a common owner, and common controller.

In similar vein to the Vegas Rule, we see that in privacy principles defined by various browsers, as well as the Do Not Track specification; tracking is defined as that which happens across “parties” (not domains or sites).

We see a similar theme on other platforms, such as Apple’s App Tracking Transparency framework which defines tracking as linking of user data “collected from other companies’ apps, websites, or offline properties”. On iOS, as an example, this notion of “company” is enshrined in the “vendor” metadata of the app. I see FPS as playing a similar role for the web.

User understanding, and enabling of user workflows/journeys (my interpretation of "user interests"), is core to the purpose of FPS. For this reason, our proposed policy includes the requirement that "Domains must share a common group identity that is easily discoverable by users". You will see from the same document that we had previously also considered "common user journey" as another requirement, but discarded it due to the difficulty of enforcing that requirement. However, we could reconsider it, or potentially list that among recommendations or best practices for site authors.

If there is any benefit to users, that benefit is at best indirect. Users only benefit to the extent that the sites they value might be better able to function.

The scenario where users benefit directly is one where an application is deployed over multiple domains for the purpose of sandboxing/privsep of untrusted content. Quoting from the FPS explainer: “Hosting untrusted, compromised content on the same domain where a user is authenticated may result in attackers’ potentially capturing authentication cookies, or login credentials (in case of password managers that scope credentials to domains); and cause harm to users.”

Long term, the answer to that is likely something browser-mediated. That might be storage access (I don't think that it is, for reasons similar to the ones I describe here) or it might be something more narrowly focused like a federated identity-specific API (in exemplar only, that proposal is a very long way from being realized).

I agree that for use-cases that are indistinguishable from “tracking”, we should have browser-mediation that involves purpose-specific prompts. Alternative solutions may include applying privacy engineering techniques (such as limiting information entropy, aggregation, on-device processing, etc.). IMO, First-Party Sets significantly whittles down the universe of use-cases that we need to solve for, down to true “third-party” use-cases. This allows us to apply more rigor and strictness to these new browser-mediated/privacy-first APIs; while also dramatically reducing the scale of adoption/deployment challenges since there are orders of magnitude fewer "third-parties" that service "first-parties".

Regarding Storage Access API (SAA), based on your parenthetical, I think we’re aligned that it is not a good long term solution for the web. It works well as a temporary compatibility fix; and for that reason I think it up-ends the priority of constituencies by favoring browser implementers’ convenience, over that of site authors (who in many cases have some re-architecting to do, but can keep much of the site functionality unchanged), and ultimately over that of users. It places on users the onus of making an impossible choice based on a hard-to-comprehend prompt. I say impossible choice, because based on the use-cases that I’ve seen the API prescribed for, the site functionality is simply broken if the user doesn’t grant permission. In addition, I have trouble coming up with a prompt that is comprehensible, because SAA provides access to a broad, general-purpose mechanism (cross-site storage) that can be for any number of applications/use-cases.

FPS would be better than the old state of the web. But if we are to consider it part of the future of the web, it does not appear to be a good deal for users.

It seems like you might be making a conclusion that is different from the privacy principles that have been previously laid out by others. As I understand it, you’re saying that users’ understanding of the site identity and their privacy expectations are based solely on the domain in the URL bar only, and not based on other aspects of the site’s identity such as who the user perceives as the owner or data controller.

My overarching principle for FPS is that it should be used only to enable users’ interactions/workflows across sites, where the user clearly understands that the sites are the same “party”. This is what we’re trying to enshrine in our UA Policy Proposal as well.

Responding to @johnwilander

An example we've mentioned is to potentially alter wording in the prompt for storage access of a dedicated SSO domain within a set. That is why we've said that we'd like the FPS proposal to be reasonably disconnected from its intended use in particular browsers.

Thanks for stating WebKit’s intended use for FPS, John! I recall that you also mentioned that FPS may have an application in WebKit’s bounce tracking prevention. Specifically, almost all SSO/federated login implementations rely on redirects. My understanding is that this is because authentication providers don’t like users entering auth credentials into cross-site frames. AFAIK, their use of cross-site cookies is limited to workflows where a periodic refresh of auth tokens is needed (a.k.a. session management).

Could you clarify how you plan to disambiguate SSO-related redirects from bounce tracking? Is there an API other than FPS? I don't think SAA would be appropriate, unless the prompt is modified?

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access.

@bslassey wrote up PR #54 to cover this proposal within the scope of FPS. Please take a look and let us know if that covers your suggestion.

johnwilander commented 3 years ago

An example we've mentioned is to potentially alter wording in the prompt for storage access of a dedicated SSO domain within a set. That is why we've said that we'd like the FPS proposal to be reasonably disconnected from its intended use in particular browsers.

Thanks for stating WebKit’s intended use for FPS, John! I recall that you also mentioned that FPS may have an application in WebKit’s bounce tracking prevention. Specifically, almost all SSO/federated login implementations rely on redirects. My understanding is that this is because authentication providers don’t like users entering auth credentials into cross-site frames. AFAIK, their use of cross-site cookies is limited to workflows where a periodic refresh of auth tokens is needed (a.k.a. session management).

Could you clarify how you plan to disambiguate SSO-related redirects from bounce tracking? Is there an API other than FPS? I don't think SAA would be appropriate, unless the prompt is modified?

We haven't decided on details but the idea is to consider a bounce/redirect that isn't between two domains in the same first party set as a candidate for being flagged as bounce tracking. From there, the browser can take various protective measures. One way is to count the "fan out" of redirects from that candidate and at some threshold stop the redirects and start asking the user what their preference is. Note that such a measure would only be needed for a bounce tracker that the browser is unwilling to clear all website data for.
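The "fan out" counting described above could look something like the following Python sketch. It assumes that fan-out means the number of distinct sites a flagged candidate has redirected to; the class name, the return values, and the threshold are hypothetical, since the comment says the details are undecided.

```python
from collections import defaultdict


class FanOutMonitor:
    """Tracks how many distinct destinations each flagged candidate redirects to."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self._destinations = defaultdict(set)

    def record_redirect(self, candidate: str, destination: str) -> str:
        """Return "allow" below the threshold, "ask_user" once it is reached."""
        self._destinations[candidate].add(destination)
        if len(self._destinations[candidate]) >= self.threshold:
            return "ask_user"
        return "allow"
```

Repeated redirects to the same destination do not increase the count, so a login provider that always returns to one site stays below the threshold in this model.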

One potential way that's been lurking in my mind is to always require a call to the Storage Access API for cross-site, unpartitioned storage access.

@bslassey wrote up PR #54 to cover this proposal within the scope of FPS. Please take a look and let us know if that covers your suggestion.

That wording doesn't solve what I'm after. What I'm saying is that if e.g. Chrome decides to automatically relax cross-site cookie blocking based on FPS, that will likely put interop pressure on other browsers to do the same. The solution is to require all cross-site cookie access to go through a call to document.requestStorageAccess(). That would mean all embeddees will have to call the Storage Access API, Chrome will resolve and provide access automatically within first party sets whereas other browsers can follow their policy for cross-site storage access as today. It would also enable Chrome to offer its users tighter cross-site cookie controls that don't lead to immediate site breakage because sites assume automatic cross-site cookie access within their set.

krgovind commented 3 years ago

We haven't decided on details but the idea is to consider a bounce/redirect that isn't between two domains in the same first party set as a candidate for being flagged as bounce tracking. From there, the browser can take various protective measures. One way is to count the "fan out" of redirects from that candidate and at some threshold stop the redirects and start asking the user what their preference is. Note that such a measure would only be needed for a bounce tracker that the browser is unwilling to clear all website data for.

Does that mean if I have a First-Party Set with sites {A, B, C}, the "bounce" pattern A->B->C->A would not be classified as tracking; but a pattern such as A->B->D->A could classify D as a potential tracker?

That wording doesn't solve what I'm after. What I'm saying is that if e.g. Chrome decides to automatically relax cross-site cookie blocking based on FPS, that will likely put interop pressure on other browsers to do the same. The solution is to require all cross-site cookie access to go through a call to document.requestStorageAccess(). That would mean all embeddees will have to call the Storage Access API, Chrome will resolve and provide access automatically within first party sets whereas other browsers can follow their policy for cross-site storage access as today. It would also enable Chrome to offer its users tighter cross-site cookie controls that don't lead to immediate site breakage because sites assume automatic cross-site cookie access within their set.

Ah, understood. I just realized the language in PR #54 is actually intended to cover the suggestion in issue #42. I requested Johann to review it.

As I explained in my response to Martin above, while Storage Access API (SAA) is a good short-term/stop-gap measure to help get past site compatibility issues for use-cases that aren't covered by new APIs; I don't believe it is the right long-term solution. On the other hand, I see FPS as trying to define "party" as referenced in various web privacy principles documents, and something I envision as a useful primitive that the browser understands for the long-term. My preference would be to not depend on SAA directly, but find another way to gate cross-site, same-party state behind a permission prompt.

Interestingly, I read your blogpost introducing SAA, and it seems like at the time you were explicitly targeting the "authenticated embeds" use-case for this API. Is there a reason that the problem scope has since expanded far beyond that case?

michael-oneill commented 3 years ago

The idea of the DNT same-party array, a predecessor of FPS, was to allow consent to apply to other origins so that e.g. a browser could decide not to block or restrict access to embedded domains declared as same-party.

A user agent might use the same-party array, when provided, to inform or enable different behavior for references that are claimed to be same-party versus those for which no claim is made. For example, a user agent might choose to exclude, or perform additional pre-flight verification of, requests to other domains that have not been claimed as same-party by the referring site. Tracking Preference Expression (DNT)

If enabled by the user the same DNT header was to be sent to all embedded resources as well as the top-level origin, if the user had given their site-specific consent (as registered by the Consent API - aka the Tracking Exception API, a first-party protocol designed with a similar purpose to SAA) the value would be DNT:0, otherwise DNT:1.

Perhaps this, or something similar, could be the basis of a compromise to enable interoperability. Maybe it's called GPC now, but for now I'll refer to it as DNT.

If DNT is not specified by the user, the browser could determine whether to partition 3rd party cookies itself, and send DNT indicating the result of its (or the user's) decision.

A third-party origin receiving DNT:1 would know that its cookies were partitioned (by interoperable browsers). DNT:0 would mean they had definitely been allowed access to their first-party (i.e. unpartitioned) cookies, and no DNT header could mean that some browsers might allow them depending on how the user had responded to the consent prompt, or that the browser has not been updated or is otherwise not interoperable.
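On the server side, the three cases in this suggestion reduce to a small lookup. The Python sketch below models the semantics proposed in this comment only; it is not how the DNT header is defined in the Tracking Preference Expression specification, and the function name and return strings are hypothetical.

```python
from typing import Mapping


def cookie_state_from_dnt(headers: Mapping[str, str]) -> str:
    """Interpret the DNT request header under the semantics proposed above."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    value = normalized.get("dnt")
    if value == "1":
        return "partitioned"
    if value == "0":
        return "unpartitioned"
    # No header (or an unexpected value): the server cannot tell either way.
    return "unknown"
```

A third-party embed could use the result to decide up front whether to rely on its cookies or to degrade gracefully.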

As @krgovind points out, the same-party array could only apply to domains managed by the same controller. The DNT TPE did not address how this could be enforced, but presumably this could also now be part of a compromise proposal, along with an agreement on the consent registration protocol, e.g. SAA.

dmarti commented 3 years ago

@michael-oneill An important difference between GPC and DNT is that DNT established its own definition of "party" and GPC does not. The GPC spec covers sending a do-not-sell-or-share-interaction, enabled by the user, to a server. GPC only provides a convenient way to send a signal that is defined elsewhere; see the regulations and directives linked to from the Legal Effects section of the GPC spec.

Sites may be required to take actions in response to GPC that are outside the scope of FPS, and FPS could affect browser behavior that is outside the scope of the regulations implemented by GPC.

johnwilander commented 3 years ago

We haven't decided on details but the idea is to consider a bounce/redirect that isn't between two domains in the same first party set as a candidate for being flagged as bounce tracking. From there, the browser can take various protective measures. One way is to count the "fan out" of redirects from that candidate and at some threshold stop the redirects and start asking the user what their preference is. Note that such a measure would only be needed for a bounce tracker that the browser is unwilling to clear all website data for.

Does that mean if I have a First-Party Set with sites {A, B, C}, the "bounce" pattern A->B->C->A would not be classified as tracking; but a pattern such as A->B->D->A could classify D as a potential tracker?

Yes, or at least contribute to such a classification.
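The A/B/C/D example can be expressed as a small Python check. This sketch assumes one reading of the rule, namely that an intermediate site in a redirect chain becomes a candidate when it is outside the first-party set of the site where the chain started; the function name and that interpretation are assumptions, and the result would only contribute to a classification rather than decide it.

```python
def bounce_candidates(chain, first_party_sets):
    """Return intermediate sites in a redirect chain outside the origin's set."""
    if len(chain) < 3:
        return []  # no intermediate hop, so nothing can be a bounce
    origin = chain[0]
    # A site that belongs to no declared set is treated as a set of one.
    origin_set = next((s for s in first_party_sets if origin in s), {origin})
    candidates = []
    for site in chain[1:-1]:
        if site not in origin_set and site not in candidates:
            candidates.append(site)
    return candidates


sets = [{"A", "B", "C"}]
print(bounce_candidates(["A", "B", "C", "A"], sets))  # prints []
print(bounce_candidates(["A", "B", "D", "A"], sets))  # prints ['D']
```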

That wording doesn't solve what I'm after. What I'm saying is that if e.g. Chrome decides to automatically relax cross-site cookie blocking based on FPS, that will likely put interop pressure on other browsers to do the same. The solution is to require all cross-site cookie access to go through a call to document.requestStorageAccess(). That would mean all embeddees will have to call the Storage Access API, Chrome will resolve and provide access automatically within first party sets whereas other browsers can follow their policy for cross-site storage access as today. It would also enable Chrome to offer its users tighter cross-site cookie controls that don't lead to immediate site breakage because sites assume automatic cross-site cookie access within their set.

Ah, understood. I just realized the language in PR #54 is actually intended to cover the suggestion in issue #42. I requested Johann to review it.

As I explained in my response to Martin above, while Storage Access API (SAA) is a good short-term/stop-gap measure to help get past site compatibility issues for use-cases that aren't covered by new APIs; I don't believe it is the right long-term solution. On the other hand, I see FPS as trying to define "party" as referenced in various web privacy principles documents, and something I envision as a useful primitive that the browser understands for the long-term. My preference would be to not depend on SAA directly, but find another way to gate cross-site, same-party state behind a permission prompt.

Interestingly, I read your blogpost introducing SAA, and it seems like at the time you were explicitly targeting the "authenticated embeds" use-case for this API. Is there a reason that the problem scope has since expanded far beyond that case?

It was very much designed for authenticated embeds and we continued to argue for that scoping. But Mozilla, Microsoft, and developers wanted to expand its scope and that’s where we ended up as part of the standards process. The change from per-frame to per-page scope is one such change. The ongoing discussion on optional partitioned cookies is another case which goes beyond authenticated embeds. Microsoft’s desire to use the API for SSO is a third case.

michael-oneill commented 3 years ago

@michael-oneill An important difference between GPC and DNT is that DNT established its own definition of "party" and GPC does not. The GPC spec covers sending a do-not-sell-or-share-interaction, enabled by the user, to a server. GPC only provides a convenient way to send a signal that is defined elsewhere; see the regulations and directives linked to from the Legal Effects section of the GPC spec.

Sites may be required to take actions in response to GPC that are outside the scope of FPS, and FPS could affect browser behavior that is outside the scope of the regulations implemented by GPC.

There's a danger of a population explosion of privacy-oriented header names, most of which would just be ignored. This signal is different because it informs servers/contexts of action already taken by the browser that could break their application; it's up to them whether they take note of it. How about Partitioned: 1?

krgovind commented 3 years ago

Responding to @johnwilander

Does that mean if I have a First-Party Set with sites {A, B, C}, the "bounce" pattern A->B->C->A would not be classified as tracking; but a pattern such as A->B->D->A could classify D as a potential tracker?

Yes, or at least contribute to such a classification.

Thanks for the clarity! What I'm confused about is: Why is it okay to exempt same-party sites from bounce tracking mitigations (in the A->B->C->A case) without prompting the user; while access to cross-site, same-party state requires a permission? Couldn't {A, B, C} use top-level redirects to sync state without requiring user permission? (Unless I misunderstood, and you intend to show the SAA prompt on top-level redirects as well)

Interestingly, I read your blogpost introducing SAA, and it seems like at the time you were explicitly targeting the "authenticated embeds" use-case for this API. Is there a reason that the problem scope has since expanded far beyond that case?

It was very much designed for authenticated embeds and we continued to argue for that scoping. But Mozilla, Microsoft, and developers wanted to expand its scope and that’s where we ended up as part of the standards process. The change from per-frame to per-page scope is one such change. The ongoing discussion on optional partitioned cookies is another case which goes beyond authenticated embeds. Microsoft’s desire to use the API for SSO is a third case.

Very useful context. Thanks!


@michael-oneill - Interesting idea with the use of HTTP headers! Narrowing down to such a solution in the context of site-specific user consent for FPS, I think a potential issue with this mechanism may be that it requires browsers to block the page from loading until the user responds to the permission prompt (since the browser knows whether or not to show a prompt, and what same-party registrable domains to list only after the user enters/navigates to the URL). I think blocking the page from loading would be required, since the header would have to be included starting on the very first request made to the server. On the other hand, an asynchronous API such as the one being discussed in #42 would allow the page to load, while sites may choose to selectively re-load portions of the page when they received access to the relevant state.

johnwilander commented 3 years ago

Responding to @johnwilander

Does that mean if I have a First-Party Set with sites {A, B, C}, the "bounce" pattern A->B->C->A would not be classified as tracking; but a pattern such as A->B->D->A could classify D as a potential tracker?

Yes, or at least contribute to such a classification.

Thanks for the clarity! What I'm confused about is: Why is it okay to exempt same-party sites from bounce tracking mitigations (in the A->B->C->A case) without prompting the user; while access to cross-site, same-party state requires a permission? Couldn't {A, B, C} use top-level redirects to sync state without requiring user permission? (Unless I misunderstood, and you intend to show the SAA prompt on top-level redirects as well)

Exempting certain scenarios is not the right framing. This is about finding a path to prevent known cross-site tracking while not breaking legitimate use cases. Blocking, prompting, or intercepting all navigational redirects is not viable.

We figured out in ITP 2017 that it's viable to delete all website data for sites the user doesn't interact with which fixed almost all bounce tracking that existed then. Bounce tracking with user interaction remains to be solved. Being able to reason about who's redirecting to whom based on something like FPS would open up a path to fix another chunk of the problem.

If we could have all developers across the web make changes in sync, we could explore much more complete solutions. But that's obviously never going to happen so we need to keep chipping away.
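The set-based reasoning about redirect chains discussed above (A->B->C->A versus A->B->D->A) can be sketched roughly as follows. This is only an illustration under stated assumptions, not any browser's actual classifier: the `FirstPartySet` type and the `bounceCandidates` function are hypothetical names, a set is modeled as a plain list of site names, and a flagged site is merely a candidate that might contribute to a tracker classification.

```typescript
// Hypothetical model: a First-Party Set is just a list of site names.
type FirstPartySet = string[];

// Returns the sites in a redirect chain that fall outside the set of the
// chain's starting site. These are only *candidates* for classification.
function bounceCandidates(chain: string[], sets: FirstPartySet[]): string[] {
  if (chain.length === 0) return [];
  const start = chain[0];
  // Find the set containing the start site; a site in no set is a singleton.
  let ownSet: FirstPartySet = [start];
  for (const candidateSet of sets) {
    if (candidateSet.indexOf(start) !== -1) {
      ownSet = candidateSet;
      break;
    }
  }
  const flagged: string[] = [];
  for (const site of chain) {
    // Flag each out-of-set site once, in order of first appearance.
    if (ownSet.indexOf(site) === -1 && flagged.indexOf(site) === -1) {
      flagged.push(site);
    }
  }
  return flagged;
}

const declaredSets: FirstPartySet[] = [["A", "B", "C"]];
const sameParty = bounceCandidates(["A", "B", "C", "A"], declaredSets); // []
const crossParty = bounceCandidates(["A", "B", "D", "A"], declaredSets); // ["D"]
```

Whether a flagged site is then blocked, has its data deleted, or is only scored remains a policy decision that this sketch does not model.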

martinthomson commented 3 years ago

Responding to @krgovind:

[...] our proposed policy includes the requirement that "Domains must share a common group identity that is easily discoverable by users"

This is exactly the point that I want to push on here. The claim implicit in that is that it is possible for group identity to be easily discoverable AND that that is a state that is relevant to users.

Let's get concrete. I happen to be aware that youtube and gmail share ownership. Do I want that shared ownership to be used to link my identity? I know that the owner of those sites wants that, but why would I have a browser adopt a policy that privileges that owner by sharing state? Does this privilege only extend to those entities that are big enough to get that sort of widespread recognition?

And what if someone was unaware of that shared ownership; would their browser have violated their expectations if they allow cookies/state from one site to be seen by the other? Or do we simply consider that person to be sufficiently ignorant that they don't get the privacy they might expect? (For reference, I was not overtly aware that Disney owned ESPN or ABC; what some people see as obvious is not always so.)

The scenarios where users benefit directly are those where an application is deployed over multiple domains for the purpose of sandboxing/privsep of untrusted content.

That is very much indirect. The site - and its engineers - are the primary beneficiary. It might make their task easier in some ways, but if they did not use multiple domains and there was an incident, would we not hold them accountable for the incident rather than saying that they didn't structure their domains responsibly?

As I understand it, you’re saying that users’ understanding of the site identity and their privacy expectations are based solely on the domain in the URL bar only, and not based on other aspects of the site’s identity such as who the user perceives as the owner or data controller.

No, I'm saying that you are the one attempting to create a new concept and the onus is on you to establish that that concept is useful/correct.

54

@johnwilander's various points upthread about using FPS to inform policy decisions is an interesting angle here. But it leads to a different manifestation of the same basic compatibility problems. The policy that Chrome adopts will, whether we like it or not, shape site expectations in ways that will affect other browsers. The goal of these standards processes is to decide these things, not shift them off into another layer.

A lot of my objections do depend very much on the sorts of policies that each browser plans to apply these classifications to. If the goal is to enable same-party cookies and storage for sites in a set and that is done without user interaction, then see my examples above. If the goal is to invoke user interaction, then I think we have better and more easily understood options (based on identity), or we might as well continue to fortify storage access.

Maybe we should be talking more about those policies we are each contemplating, because it's gotten to the point where a lot of the relevant functionality has moved into opaque, browser-specific policies.

And while we are on storage access...

it up-ends the priority of constituencies by favoring browser implementers’ convenience, over that of site authors

I might have to disagree there. Or at least point out that it's more nuanced than that. It's a small imposition on sites, which I might agree isn't ideal along those lines. At the same time it aims to arbitrage that imposition into privacy outcomes for users. For the most part, SAA doesn't force sites to completely restructure their services in the same way that a new federated identity API or other purpose-specific API might.

michael-oneill commented 3 years ago

@michael-oneill - Interesting idea with the use of HTTP headers! Narrowing down to such a solution in the context of site-specific user consent for FPS, I think a potential issue with this mechanism may be that it requires browsers to block the page from loading until the user responds to the permission prompt (since the browser knows whether or not to show a prompt, and what same-party registrable domains to list only after the user enters/navigates to the URL). I think blocking the page from loading would be required, since the header would have to be included starting on the very first request made to the server. On the other hand, an asynchronous API such as the one being discussed in #42 would allow the page to load, while sites may choose to selectively re-load portions of the page when they received access to the relevant state.

I don't think the page has to be blocked, just that embeds have their cookies partitioned unless the browser has set an internal "consented" flag for an embed's origin. The cookies will be sent along with the Partitioned header, and at that point the browser knows whether they are partitioned or not.

The meaning would be as follows:

Partitioned: 0 means these cookies are definitely not partitioned, i.e. they are the same as the embedded resource's first-party cookies. This indicates that the user has previously accepted a prompt, or that this domain is in a first-party set, which was possibly confirmed by the user.

Partitioned: 1 means these cookies are partitioned, and will remain so unless the user accepts a prompt. The origin server might initiate an SAA request in this case.

No Partitioned header means the cookies may or may not be partitioned (the browser does not support the header, or does not want to leak fingerprinting entropy).
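A server-side reading of these three cases might look something like the sketch below. The Partitioned header is only a proposal made in this thread, so nothing here reflects a shipped browser feature; `PartitionState`, `readPartitionState`, and `shouldRequestStorageAccess` are hypothetical names, and treating an unrecognized value like a missing header is an assumption.

```typescript
type PartitionState = "unpartitioned" | "partitioned" | "unknown";

// headerValue is the raw value of the proposed (hypothetical) Partitioned
// request header, or undefined when the browser did not send it.
function readPartitionState(headerValue: string | undefined): PartitionState {
  if (headerValue === undefined) return "unknown";
  const value = headerValue.trim();
  if (value === "0") return "unpartitioned";
  if (value === "1") return "partitioned";
  // Assumption: an unrecognized value is treated like a missing header.
  return "unknown";
}

// Following the proposal, an embed would only consider initiating a storage
// access request when it knows its cookies are partitioned.
function shouldRequestStorageAccess(state: PartitionState): boolean {
  return state === "partitioned";
}
```

Keeping "unknown" as its own state lets a server fall back to its existing behavior for browsers that omit the header.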

johannhof commented 3 years ago

IMO, First-Party Sets significantly whittles down the universe of use-cases that we need to solve for, down to true “third-party” use-cases. This allows us to apply more rigor and strictness to these new browser-mediated/privacy-first APIs; while also dramatically reducing the scale of adoption/deployment challenges since there are orders of magnitude fewer "third-parties" that service "first-parties".

This general idea always excited me about FPS, but it feels like a lot of the criticism it has seen is rooted in the fact that FPS isn't focused on exclusively trying to solve user-experienced breakage from lack of third-party cookies. You're also inviting origins that likely retain functionality when constrained to continue (or even start) sharing their cookies, such as Disney might be able to do with ESPN, to follow Martin's example.

Attacks on FPS should not only be defined through "establishing an FPS without belonging to the same entity", but also "establishing an FPS without really needing to".

If we were instead trying to solve specific breakage scenarios, then I think we'd be having a more focused discussion about how far we can restrict the API (instead of coming up with additional concepts like Same-Party cookies that further expand capabilities). For example, in my view, the set size vastly changes the implications to privacy and thus the treatment that a party requesting a set should receive. It would be interesting to know if small, unidirectional sets would fix the majority of use cases. I want to echo this suggestion that @AramZS made earlier:

It might be useful to look at a break out session to step back, look at the specific issues that FPS intends to address, and other purposes it might be useful for, such as @johnwilander's interest, and move to addressing those more directly without being blocked by the additional concerns that FPS specifically creates?

We're having these "how far can we go without sacrificing privacy to fit use case X" discussions a lot for SAA. It has made development of the spec a lot tougher, but that's a good price to pay.

Still, as John said earlier, we're often in disagreement about how far we can go to protect user privacy. The prompt/promise helps with this. It allows browsers to treat sites differently: allowing, rejecting, or invoking user participation (or not, up to the browser). That's the reason we're suggesting prompt/SAA support for FPS.

But maybe FPS isn't built for that. It is, as you say, a new primitive for the web, not a tool like SAA. So maybe relaxation of storage requirements just shouldn't be part of the spec, and then we also don't need a prompt. Maybe the part that relaxes storage access should be specified separately (which is what the SAA tries to be, but maybe we can figure out something else).

And yes, SAA may (and hopefully will) at some point be obsoleted by an even better tool, but I suspect there would also be little harm in keeping it on the web platform as it's designed to not compromise on user's privacy.

michael-oneill commented 3 years ago

But maybe FPS isn't built for that. It is, as you say, a new primitive for the web, not a tool like SAA. So maybe relaxation of storage requirements just shouldn't be part of the spec, and then we also don't need a prompt. Maybe the part that relaxes storage access should be specified separately (which is what the SAA tries to be, but maybe we can figure out something else).

Perhaps FPS should be a component of a wider manifest recording machine-readable, privacy-oriented declarations, such as ownership of domains, controller contact addresses, common brand association, purposes for processing personal data, purposes for using cookies or other storage, and the like.

Widening the context beyond the technical aspects could leverage existing or emerging legal enforcement, reducing the onus on browsers. The extra information in the manifest can also be used to improve transparency of the UX.

It would also allow browsers more implementation flexibility: some might lift cookie restrictions if they feel that the relevant declarations are being robustly enforced, while others might insist on SAA or other confirmation prompts.
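To make the idea a little more concrete, here is a minimal sketch of what such a manifest and a browser-side decision might look like. Every name in it is hypothetical: the thread only lists the kinds of declarations a manifest could carry, so the `PrivacyManifest` fields, the two browser stances, and the `decideStorage` function are assumptions made purely for illustration.

```typescript
// All field names are hypothetical; no manifest format has been specified.
interface PrivacyManifest {
  owner: string;
  memberDomains: string[];
  controllerContact: string;
  commonBrand: string;
  processingPurposes: string[];
  storagePurposes: string[];
}

// Two illustrative stances a browser might take towards declarations.
type BrowserStance = "trusts-declarations" | "always-prompt";
type StorageDecision = "lift-restrictions" | "prompt" | "keep-partitioned";

function decideStorage(
  manifest: PrivacyManifest,
  embedDomain: string,
  stance: BrowserStance
): StorageDecision {
  // Domains not declared in the manifest get no special treatment.
  if (manifest.memberDomains.indexOf(embedDomain) === -1) {
    return "keep-partitioned";
  }
  return stance === "trusts-declarations" ? "lift-restrictions" : "prompt";
}

const exampleManifest: PrivacyManifest = {
  owner: "Example Corp",
  memberDomains: ["shop.example", "news.example"],
  controllerContact: "privacy@shop.example",
  commonBrand: "Example",
  processingPurposes: ["account management"],
  storagePurposes: ["login session"],
};
```

The extra fields play no part in the decision here; in the proposal they exist mainly to support transparency in the UX and legal enforcement.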

krgovind commented 3 years ago

Responding to @johnwilander

Exempting certain scenarios is not the right framing. This is about finding a path to prevent known cross-site tracking while not breaking legitimate use cases. Blocking, prompting, or intercepting all navigational redirects is not viable.

We figured out in ITP 2017 that it's viable to delete all website data for sites the user doesn't interact with which fixed almost all bounce tracking that existed then. Bounce tracking with user interaction remains to be solved. Being able to reason about who's redirecting to whom based on something like FPS would open up a path to fix another chunk of the problem.

I think it's helpful to think about this from the perspective of what principles we are aiming at for the end-state. It helps us reason through our design decisions, but also to provide guidance for developers on how to migrate their websites.

User identifiers can be exchanged across sites via any number of mechanisms; including cross-site cookies, and top-level redirects. If we add more friction on one mechanism (cross-site cookies), but signal that it's okay to achieve the same functionality via a different mechanism (top-level redirects); that sends a confusing message to developers.

If we could have all developers across the web make changes in sync, we could explore much more complete solutions. But that's obviously never going to happen so we need to keep chipping away.

Does this mean that in the ideal/end-state, you would like to clamp down on navigational redirects even across sites within the same FPS? In other words, do you anticipate investing in Storage Access API or a permission-like solution for those as well? (I totally understand that you're suggesting we should chip away at these gradually, but I'm trying to understand the end-state)


Responding to @martinthomson

This is exactly the point that I want to push on here. The claim implicit in that is that it is possible for group identity to be easily discoverable AND that that is a state that is relevant to users.

Let's get concrete. I happen to be aware that youtube and gmail share ownership. Do I want that shared ownership to be used to link my identity? I know that the owner of those sites wants that, but why would I have a browser adopt a policy that privileges that owner by sharing state? Does this privilege only extend to those entities that are big enough to get that sort of widespread recognition?

Just so I can understand this example better - is the concern that YouTube and GMail are separate applications that are currently hosted on separate registrable domains? What about if they were hosted on the same registrable domain? Same origin? How do you draw the boundary of your understanding of the application/site's ability to seamlessly share state?

And what if someone was unaware of that shared ownership; would their browser have violated their expectations if they allow cookies/state from one site to be seen by the other? Or do we simply consider that person to be sufficiently ignorant that they don't get the privacy they might expect? (For reference, I was not overtly aware that Disney owned ESPN or ABC; what some people see as obvious is not always so.)

FPS gives browsers an opportunity to highlight that shared ownership to users. This can be done both via requiring something like common branding on the website/content page itself, as well as within the browser UI.

The scenarios where users benefit directly are those where an application is deployed over multiple domains for the purpose of sandboxing/privsep of untrusted content.

That is very much indirect. The site - and its engineers - are the primary beneficiary. It might make their task easier in some ways, but if they did not use multiple domains and there was an incident, would we not hold them accountable for the incident rather than saying that they didn't structure their domains responsibly?

I'm not sure what this is getting at, it seems a bit convoluted to me. The way I see it, all of us, as stewards of the web platform, have a responsibility to ensure that developers can build the most secure and private experiences for end-users. If sandboxing content on domains is currently the only reliable way to deliver the most secure experience to users on the web platform, on old and new clients; why is it not fair to characterize that as directly beneficial to users? Are there well known alternative solutions that we should recommend instead?

As I understand it, you’re saying that users’ understanding of the site identity and their privacy expectations are based solely on the domain in the URL bar only, and not based on other aspects of the site’s identity such as who the user perceives as the owner or data controller.

No, I'm saying that you are the one attempting to create a new concept and the onus is on you to establish that that concept is useful/correct.

Actually, the concept of "party" within the context of anti-tracking on the web was created by many others before me. Mozilla included.

If the goal is to invoke user interaction, then I think we have better and more easily understood options (based on identity), or we might as well continue to fortify storage access.

Within the context of granting access to cross-site, same-party state; may I interpret this statement as saying that a satisfactory resolution to #42 would address your concerns? (Yes, there are other applications we envision for FPS that warrant further discussion, but I'd like to start with this one.)

it up-ends the priority of constituencies by favoring browser implementers’ convenience, over that of site authors

I might have to disagree there. Or at least point out that it's more nuanced than that. It's a small imposition on sites, which I might agree isn't ideal along those lines. At the same time it aims to arbitrage that imposition into privacy outcomes for users.

Actually the crucial part of what I was attempting to say was that SAA is better for browsers/website developers than it is for users. It might be useful to take federated login as a case study, and compare the properties of SAA vs. something like WebID. A few specific benefits of WebID that I can state right away: (a) A prompt that makes sense to users, within the context of the action they just performed, i.e. clicking on a Login button; (b) Browsers can enforce directed identifiers; (c) Better solves federated login specific flows, such as the one being discussed on privacycg/storage-access/issues/82.

For the most part, SAA doesn't force sites to completely restructure their services in the same way that a new federated identity API or other purpose-specific API might.

The WebID folks did an excellent job analyzing the deployment topology for federated identity. Once you narrow down the universe of use-cases that currently depend on cross-site exchange of identifiers down to true "third-parties", the scale of required adoption dramatically drops. i.e. relying parties are in the order of millions, while identity providers are in the order of tens.

While the scale is likely not as dramatic for other third-party service providers (ad-tech, SaaS providers, CAPTCHA providers, etc.); I think it highlights how First-Party Sets makes the cross-site tracking problem more tractable.

krgovind commented 3 years ago

I don't think the page has to be blocked, just that embeds have their cookies partitioned unless the browser has set an internal "consented" flag for an embed's origin.

@michael-oneill - Right. Since the flag depends on whether the user granted permission, this "consented" flag would be set either during the page load, or after the page is done loading, correct? We would need to notify the website when the flag value changes from Partitioned: 1 to Partitioned: 0. This is where an async API is helpful, because the developer can wait on a JavaScript promise to resolve/reject, at which point they may choose to re-load their cross-site content.


Responding to @johannhof

but it feels like a lot of the criticism it has seen is rooted in the fact that FPS isn't focused on exclusively trying to solve user-experienced breakage from lack of third-party cookies. You're also inviting origins that likely retain functionality when constrained to continue (or even start) sharing their cookies, such as Disney might be able to do with ESPN, to follow Martin's example.

Indeed. Whether we should place restrictions on FPS being used only for user-visible workflows is something we considered. We thought about enforcing "Common user journeys" as a UA policy requirement (see under "Alternatives Considered, but Discarded" on our UA Policy Proposal), but it is likely going to be difficult to verify. However, we could certainly reconsider it.

Also, looking across different browsers' anti-tracking policies, Apple's ATT, and the DNT specification, I did not see this type of requirement articulated; so there was no precedent for me to look to.

For example, in my view, the set size vastly changes the implications to privacy and thus the treatment that a party requesting a set should receive. It would be interesting to know if small, unidirectional sets would fix the majority of use cases.

Yup! We are discussing set size limitations on #29. I'd love to hear your thoughts there!

I want to echo this suggestion that @AramZS made earlier:

It might be useful to look at a break out session to step back, look at the specific issues that FPS intends to address, and other purposes it might be useful for, such as @johnwilander's interest, and move to addressing those more directly without being blocked by the additional concerns that FPS specifically creates?

We have documented the cross-site same-party use-cases we have heard about here, but there are others that we continue to hear about. For example, here are a couple of others to add to the list: the use of SaaS services across first-party sites (#33), and the first-party SSO use-case that Safari is trying to solve with the use of FPS in their bounce tracking prevention solution.

But maybe FPS isn't built for that. It is, as you say, a new primitive for the web, not a tool like SAA. So maybe relaxation of storage requirements just shouldn't be part of the spec, and then we also don't need a prompt. Maybe the part that relaxes storage access should be specified separately (which is what the SAA tries to be, but maybe we can figure out something else).

Agreed. Intuitively speaking, specifying something separate from SAA makes sense to me. We're thinking about this, and will come up with something on #42.

michael-oneill commented 3 years ago

Responding to @krgovind

It would be useful for there to be an async indication API available to the website, but the existing SAA requestStorageAccess (which returns a Promise) works only in an embedded browsing context. Are you suggesting a consent prompt initiated by the top-level context? I suppose the embedded context could postMessage the top level when an SAA Promise resolves; would that work?

michael-oneill commented 3 years ago

In the DNT TPE, although the prompts were managed by the sites not the browser, we had both "web-wide" consent and "site-specific" consent. Site-specific consent meant DNT:0 headers were only sent to particular subresource-site tuples, i.e. equivalent to using a single-valued partitioned cookie. This could then be actioned so that cross-domain tracking did not occur. It was silent on what that action would be, but the thought was that browsers could act on it, e.g. by enforcing double-keyed (aka partitioned) cookies.

Web-wide consent would enable full access i.e. the particular subresources indicated by the API would receive DNT:0 everywhere, irrespective of the top-level parent. This is similar to what SAA enables now.

A similar system based on FPS and SAA, using the FPS manifest to deliver transparency information, together with a top-level "SAA" type API for partitioned cookies only, could be a big improvement, especially because now the prompts would be facilitated by and hopefully enforced by browsers.

johnwilander commented 3 years ago

Responding to @johnwilander

Exempting certain scenarios is not the right framing. This is about finding a path to prevent known cross-site tracking while not breaking legitimate use cases. Blocking, prompting, or intercepting all navigational redirects is not viable. We figured out in ITP 2017 that it's viable to delete all website data for sites the user doesn't interact with which fixed almost all bounce tracking that existed then. Bounce tracking with user interaction remains to be solved. Being able to reason about who's redirecting to whom based on something like FPS would open up a path to fix another chunk of the problem.

I think it's helpful to think about this from the perspective of what principles we are aiming at for the end-state. It helps us reason through our design decisions, but also to provide guidance for developers on how to migrate their websites.

User identifiers can be exchanged across sites via any number of mechanisms; including cross-site cookies, and top-level redirects. If we add more friction on one mechanism (cross-site cookies), but signal that it's okay to achieve the same functionality via a different mechanism (top-level redirects); that sends a confusing message to developers.

Again, that's the wrong framing. I'm not suggesting any kind of "signal that it's okay to achieve the same functionality via a different mechanism (top-level redirects)." Removing a tracking vector does not imply acceptance of other tracking vectors. We want to remove all tracking vectors and will do that as viable paths become available. Navigational tracking is a remaining tracking vector and this CG is about to get a work item to address it.

If we could have all developers across the web make changes in sync, we could explore much more complete solutions. But that's obviously never going to happen so we need to keep chipping away.

Does this mean that in the ideal/end-state, you would like to clamp down on navigational redirects even across sites within the same FPS? In other words, do you anticipate investing in Storage Access API or a permission-like solution for those as well? (I totally understand that you're suggesting we should chip away at these gradually, but I'm trying to understand the end-state)

The ideal end state is that there is no cross-site tracking and single sign-on and federated logins are under the user's control. We look at the whole spectrum of solutions such as IsLoggedIn, WebID, FPS, navigational tracking prevention, Storage Access API, and temporary or per-site mitigations to get us to the end state.

In concrete terms, we hope the combination of:

… will get us to a point where the browser can reason about what's happening and help the user log in to websites while not opening up for cross-site tracking beyond what the deliberate logins imply.

martinthomson commented 3 years ago

I get the sense that this discussion has fractured a little. I will try to avoid topics that are tangential and focus on important matters. (Expect one or more new issues to explore those tangents I think are relevant.)

@krgovind:

Just so I can understand this example better - is the concern that YouTube and GMail are separate applications that are currently hosted on separate registrable domains? What about if they were hosted on the same registrable domain? Same origin? How do you draw the boundary of your understanding of the application/site's ability to seamlessly share state?

If the two were on the same registerable (registrable? my spelling checker likes neither) domain, there would be a clear signal that these are the same entity, expressed in unequivocal terms that both computers and people can understand.

If these were on the same domain, that would be a strong signal. I note that Google has chosen to unify domains for its search, maps, and mail products, but has not done so for other products. I can't pretend to know why those decisions were made, a range of factors likely contribute to any such decision.

I'm asserting that "seamlessly sharing state" as you put it is not always in the interest of users. It hasn't been a roaring success thus far.

FPS gives browsers an opportunity to highlight that shared ownership to users. This can be done both via requiring something like common branding on the website/content page itself, as well as within the browser UI.

Suggesting that "unified branding" might address this shortcoming isn't helpful, as that is not a cue that can be used by users at critical moments. Users (and their browsers) can (and do) make decisions based on just the information in a URL and you would deny them that option.

I'm not sure what this is getting at, it seems a bit convoluted to me. The way I see it, all of us, as stewards of the web platform, have a responsibility to ensure that developers can build the most secure and private experiences for end-users. If sandboxing content on domains is currently the only reliable way to deliver the most secure experience to users on the web platform, on old and new clients; why is it not fair to characterize that as directly beneficial to users? Are there well known alternative solutions that we should recommend instead?

I suspect that what you are pushing at here is the idea that we're creating an incentive to consolidate resources on a single registerable domain. That standing on principle is preventing people from deploying their services across different registerable domains on terms that suit them. That's not the case at all. I don't think that many of those cases are as intractable as is being made out.

There are cases (this lucid* one for example) where sites have used multiple domains to build a single "experience". In essence, there is a central service with a number of appendages that serve related purposes. Marketing seems to be the only additional purpose in that example, but I've seen more complex examples. In all of those cases, the stated goal might be to share state seamlessly.

But that again is the world view from the point of view of the owner of those sites. Let's say that you have a suite of apps that are spread across different domains. Let's say you have a specialized design suite for widgets at three different domains: cracked-connectors.example, flimsy-fasteners.example, and asymmetric-attachments.example, with each presenting a customized interface for a specific purpose.

Why would we deny a user the option of having separate interactions with each of those different sites? What information - aside from what you might be able to present to a user after their privacy expectations have been violated - might allow a user to know how their data will be treated ahead of following a link to those sites?

Actually, the concept of "party" within the context of anti-tracking on the web was created by many others before me. Mozilla included.

I will request (as others have on several occasions) that you stop pursuing this style of argumentation.

@johnwilander

Removing a tracking vector does not imply acceptance of other tracking vectors.

THIS.

That said, I'm still not following your overall vision here. The top-line is good. And I understand the bit where we have some temporary mess on the way to some end state. But you seem to have included a few things in that end state that I'm not sure are entirely justified. (I don't know where to take that discussion, as it is much broader, but I will open a new issue here to talk about the FPS connection.)

michael-oneill commented 3 years ago

@martinthomson

That said, I'm still not following your overall vision here. The top-line is good. And I understand the bit where we have some temporary mess on the way to some end state. But you seem to have included a few things in that end state that I'm not sure are entirely justified. (I don't know where to take that discussion, as it is much broader, but I will open a new issue here to talk about the FPS connection.)

The problem to solve is how to inhibit cross-domain tracking, without sacrificing the ability to implement valuable use cases.

Perhaps using FPS membership to remove partitioning on all cookies is too permissive. If persistent cookies or other storage always remain partitioned, or at least until an SAA prompt has been accepted, but subresources that are members of a set can see their short-duration or session cookies, does the trade-off become easier?

For requests sent to subresources within the set, cookies that have been placed with no expiry, or with an expiry of less than 1 hour, are sent from a "global" cookie jar; all other cookies are sent from the partitioned jar.

The Partitioned header, if there is one, could indicate that persistent storage was partitioned, while this limited set of cookies may not be if the receiving subresource was in an FPS.
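
To make the jar-selection rule above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a proposed implementation: the `Cookie` type, the `select_jar` function, the jar labels, and the `storage_access_granted` flag are all hypothetical names, and "member of a set" is assumed to mean that the top-level site and the subresource's site appear in the same set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional

# The "less than 1 hour" threshold suggested in the comment above.
SHORT_LIVED_LIMIT_SECONDS = 60 * 60


@dataclass(frozen=True)
class Cookie:
    """Hypothetical cookie model; max_age_seconds=None means a session cookie."""
    name: str
    max_age_seconds: Optional[int]


def is_short_lived(cookie: Cookie) -> bool:
    """True for session cookies and cookies expiring in under one hour."""
    return (
        cookie.max_age_seconds is None
        or cookie.max_age_seconds < SHORT_LIVED_LIMIT_SECONDS
    )


def select_jar(
    cookie: Cookie,
    top_level_site: str,
    subresource_site: str,
    first_party_sets: Iterable[FrozenSet[str]],
    storage_access_granted: bool = False,  # assumption: models an accepted SAA prompt
) -> str:
    """Return which jar ("global" or "partitioned") the cookie is sent from."""
    if storage_access_granted:
        # Persistent state is unpartitioned only after an explicit grant.
        return "global"
    same_set = any(
        top_level_site in s and subresource_site in s for s in first_party_sets
    )
    if same_set and is_short_lived(cookie):
        return "global"
    return "partitioned"
```

With a set containing `a.example` and `b.example`, a session cookie on a `b.example` subresource embedded under `a.example` would come from the "global" jar, while a cookie with a one-day expiry would stay partitioned unless storage access had been granted.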

krgovind commented 3 years ago

A similar system based on FPS and SAA, using the FPS manifest to deliver transparency information, together with a top-level "SAA" type API for partitioned cookies only, could be a big improvement, especially because now the prompts would be facilitated by and hopefully enforced by browsers.

@michael-oneill : Yup! I am indeed envisioning a prompt initiated by the top-level site, and I think we're aligned on how the semantics should work w.r.t. access to unpartitioned state vs. partitioned state.


Responding to @johnwilander

Again, that's the wrong framing. I'm not suggesting any kind of "signal that it's okay to achieve the same functionality via a different mechanism (top-level redirects)." Removing a tracking vector does not imply acceptance of other tracking vectors. We want to remove all tracking vectors and will do that as viable paths become available. Navigational tracking is a remaining tracking vector and this CG is about to get a work item to address it.

I think our differences on the framing of that question might be related to the fact that I am thinking of FPS as a comprehensive "privacy boundary". And when we perform the engineering/privacy analysis of such a construct, we do have to consider why it's an acceptable privacy boundary for one mechanism, and not the other. We are trained by exercises like the TAG S&P Questionnaire to think somewhat adversarially and assume that if something can be used to achieve a purpose, it will be used for that purpose.

However, based on your response, it sounds like you are thinking of FPS as a temporary/ad-hoc measure to iterate on bounce tracking protection; not a comprehensive boundary to determine "first-party" vs. "third-party".

  • FPS with login purpose information

We started an issue for this in #28 but it's a bit underspecified and we would love to hear from you on the issue about exact requirements, and how you envision it being concretely implemented. Some questions on my mind:


Responding to @martinthomson

Suggesting that "unified branding" might address this shortcoming isn't helpful, as that is not a cue that can be used by users at critical moments. Users (and their browsers) can (and do) make decisions based on just the information in a URL and you would deny them that option.

I agree that users (and browsers) should continue to make security decisions based on the URL. With privacy expectations though, I think we get tangled up in more human questions like "who is legally accountable for my data? do I trust them to know my history/preferences?". I am not an expert in those questions though, which is why I tend to look in places such as anti-tracking policies. What we're trying to do with FPS is to distill those human concepts into a primitive that browsers can understand. Browsers do use the registrable domain in the URL as the "privacy boundary" today, but IMO it's because that's the best approximation we have.

I suspect that what you are pushing at here is the idea that we're creating an incentive to consolidate resources on a single registerable domain. That standing on principle is preventing people from deploying their services across different registerable domains on terms that suit them. That's not the case at all. I don't think that many of those cases are as intractable as is being made out.

Yup, that was my line of thinking; specifically for the discussion about user benefit in cases where sites are deploying over multiple domains to ensure security guarantees. I do want to point out that sites support old clients, as well as new. So while I realize that newer clients have gotten better with providing origin-locked security guarantees (although I hear that even today, we still very much use registrable domain as a security boundary); my understanding is that not all clients have these in place.

Actually, the concept of "party" within the context of anti-tracking on the web was created by many others before me. Mozilla included.

I will request (as others have on several occasions) that you stop pursuing this style of argumentation.

I'm sorry, but I honestly don't remember ever having a discussion regarding the definition of "tracking" in Mozilla's anti-tracking policy (which I presume is intended to inform both short-term and long-term privacy work), which is based on "party" and not "registrable domain". From what I recall, previous discussions were related to Firefox's use of the Disconnect entities list, which I have been told is a temporary compat measure.

HarneetSidhana commented 3 years ago

The Microsoft Edge team generally believes that First-Party Sets are good for users.

The referenced W3C TAG Privacy Principles doc as linked to is still an in-progress doc that we should be cautious about referencing at this time. However, it's worth noting that its current definition of the "Vegas Rule" explicitly talks about the "first party" being "a party with which the user intends to interact," with "party" being further defined as including a set of entities that share a common owner that is readily evident.

Discussions around this proposal have necessarily taken on more discussion around potential UX than would normally be expected in a platform API discussion so that we can reason about how a user can be sufficiently aware of the party they're interacting with. We are convinced that reasonable browser UX can be built that helps inform users of common ownership, but that more work and exploration need to happen across browser implementers, including by Microsoft Edge.

Companies of all sizes, without a solution like First-Party Sets, will likely consolidate domains to continue supporting existing use-cases. We believe this is a worse user outcome, since all subdomains can not only share state and learn it’s the same user, but it will also deny users any ability to decouple state on an individual site basis. For instance, in the example above, without FPS, a likely outcome is for the company that owns the three different domains (cracked-connectors.example, flimsy-fasteners.example, and asymmetric-attachments.example) to consolidate them into a single domain, thereby denying users any option to decouple state across different sites. With FPS, we can inform users about the relationship between these sites, and a user who wants to can opt out of FPS globally through a browser setting that clears all site data.

For smaller sites, there's an added challenge: if they use services hosted by other entities (e.g. mysmallcompany.support-solution-provider.example hosted by a support solution provider), they may be disadvantaged because of a potentially higher bar of setting up CNAME entries to point a subdomain to the service provider and increased cookie exfiltration risk for unrelated cookies.

For larger sites, having a larger number of site developers on a single domain (or, quite possibly, a single origin) dramatically increases the risk of a security bug on one site causing a broader compromise than if they remained on separate domains.

There are real concerns around how we ensure interop and compat if not all UAs ship First-Party Sets with the same set of capabilities and policies for what can form a set. We support approaches that empower browsers to make their own decisions about if/how FPS is exposed without risking compatibility breaks with sites that choose to use FPS. The "UA Policy Proposal" is also of interest to us to ensure browsers that want to participate in using this approach can have a reasonable say on the criteria and governance for what's allowed to constitute a set.

I also see interest from others on non-cookie-related applications for this proposal. It would be useful to understand what aspects have more common implementer interest so that we can prioritize time on making progress on ensuring those can be achieved.

pes10k commented 3 years ago

We believe this is a worse user outcome, since all subdomains can not only share state and learn it’s the same user, but it will also deny users any ability to decouple state on an individual site basis

If the concern is that grouping domains will cause more data coupling / mixing than the site owners want, then (as mentioned on the other thread) let's focus on solving that problem, and give site authors ways of partitioning state under a single domain.

But if the concern is that grouping sites into a single domain will deny users control over state coupling, then this seems exactly backwards. In browsers that block or partition 3p storage, FPS strictly reduces how much control users have over data coupling. That's true w/ or w/o an increased % of sites being grouped on individual domains.

Put differently, the "FPS should not be used for storage boundary decisions" crowd has a clear story for how users can anticipate the data-sharing effects of clicking on a link before they click it (i.e., eTLD+1). I still have not heard how FPS-proponents plan on giving users this predictability (not how it could be done, but how they plan on doing it in the systems they control). Until there's a real proposal there, it's just not accurate to say "FPS give users more control" over anything.

smaller sites… may be disadvantaged because of a potentially higher bar of setting up CNAME entries

This is very surprising. Does the Edge team envision it being easier to join / leave FPS than it is to edit a CNAME record? That doesn't seem compatible with what's (so far) envisioned by the referenced UA policy proposal. Presumably joining and leaving sets would need some level of friction if it's to be predictable to users, no?

martinthomson commented 3 years ago

With FPS, we can inform users about the relationship between these sites, and a user who wants to can opt out of FPS globally through a browser setting that clears all site data.

I want to explore this assertion a little.

This implies that the proposal imagines an opt-out. That is, the most private option requires deliberate action on the part of individuals. It is little surprise then that so much depends on UX questions.

The defining characteristic of this proposal - at least from Google's perspective, and I infer Microsoft - seems to be that the seamless sharing of state across co-owned properties is the default. In that imagining, users need to recognize the threat this might pose to their privacy and take specific steps, either to avoid the sharing in the first place - which, though ideal, might be difficult to anticipate - or to break linkages after they have occurred. I assert that post facto corrections might be impossible: if the user previously had two distinct interactions and the sites were to irrevocably link those accounts, users might have no recourse.

I would greatly prefer if isolation occurred by default, with extraordinary action being required to make the connection. Connecting state by default is not in any user's interest.

At this point, it's probably worth addressing the argument that says that this produces incentive for sites to consolidate onto a single registerable domain. Firstly, it probably needs to be said: yes, this will add some incentive to consolidate. That is, if there wasn't already enough such incentive. But this is one small factor that might contribute to a complex decision that involves a bunch of other contributing factors. When the unification of state is just one small user prompt away (whether that be through storage access, a more comprehensible identity interaction, or a more targeted affordance), I don't see lack of FPS as a major decision factor. I know very well how little bumps can have a big impact on attrition, but our responsibility to users supersedes any obligation to help optimize funnels and so forth.

If site owners wish to project multiple identities, they should expect users to reciprocate. And the least we can do is provide users with some power to choose, rather than have the decision made for them.

michael-oneill commented 3 years ago

I agree the requirement for a prior opt-in rather than an option to opt-out should be the norm, as it legally already is in Europe (and increasingly other parts of the world).

But FPS works either way.

In an opt-in default environment the site must still ask for the user's consent before delivering content that tracks them, and that applies also to embedded content.

FPS enables cookies to be shared across domains owned/managed by the same entity, which would also be the controller of any personal data being processed, which usually will also require the agreement of the user. But once the user has agreed to that, they do not necessarily have to agree again on every site/domain in the FPS.

As for browser enforcement, the current majority (browser) status quo is no restrictions on third-party cookies. FPS is designed to create options for when restrictions, i.e. partitioning, of embedded access are implemented by all browsers, and some user-beneficial UX might become unavailable.

The issue of opt-in versus opt-out is an important legal question, but the platform does not need to take a position on that; it just needs to ensure that both approaches are equally possible.