privacycg / nav-tracking-mitigations

Navigation-based Tracking Mitigations
https://privacycg.github.io/nav-tracking-mitigations/

Clarity in "Navigational-Tracking Mitigations" regarding #43

Closed: judielaine closed 1 year ago

judielaine commented 1 year ago

I am possibly missing some precision here.

I'm considering the SSO use case where a dedicated host, federate.example.com, mediates the authentication for site1.example.com. In some use cases, federate.example.com never has a user interaction. federate.example.com may choose to set cookies at example.com and at federate.example.com. If I understand the domain-match RFC and the comparison being made in §6.2.3 Deletion ("Let cookieList be the set of cookies from the cookie store whose domain attribute is a domain-match with host."), then when federate.example.com is the host and its hour is up, only cookies set at federate.example.com are a domain-match with host; the cookies at example.com persist.

What happens when a host such as somehost.tracking.evil sets cookies at tracking.evil?
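To make the comparison concrete, here is a minimal sketch of the domain-match test from RFC 6265 §5.1.3, applied in the direction the quoted deletion step reads (the cookie's domain attribute is the string, the host is the domain string). The function names `domain_match` and `is_ip_address` are illustrative assumptions, not names taken from the spec.

```python
import ipaddress


def is_ip_address(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def domain_match(string: str, domain_string: str) -> bool:
    """RFC 6265 section 5.1.3: does `string` domain-match `domain_string`?"""
    string = string.lower()
    domain_string = domain_string.lower()
    if string == domain_string:
        return True
    # Otherwise domain_string must be a dot-separated suffix of string,
    # and string must be a host name rather than an IP address.
    return string.endswith("." + domain_string) and not is_ip_address(string)


host = "federate.example.com"
for cookie_domain in ("federate.example.com", "example.com"):
    # Assumed argument order: cookie domain first, host second.
    print(cookie_domain, domain_match(cookie_domain, host))
```

Under this reading, only the cookie whose domain is federate.example.com matches; the example.com cookie does not. Swapping the arguments (host first, cookie domain second) would match both, which is why the direction of the comparison matters for the question above.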

Thanks for your clarification; my regrets if I am confusing domains and hosts.


Image (PlantUML source):

`
rectangle "site1.example.com" as sp
rectangle "federate.example.com" as proxy
rectangle "identity.provider" as idp
database "state 0" as idpState
database "state 1" as proxyState
database "example.com\ncookies" as domainCookies

sp -> proxy: 1.0: navigate
proxy ..> proxyState: 1.1: check\ncookie
proxy -[#red,thickness=4]> sp: redirect\nif State
proxy -> idp: 1.2: navigate
idp ..> idpState: 2: check\ncookie
idp -> proxy: 3.0: redirect
proxy ..> proxyState: 3.1: set\ncookie
proxy -> sp: 3.3: redirect
sp ..> domainCookies
proxy ..> domainCookies
`

wanderview commented 1 year ago

I haven't read this fully, but is this about the in-progress bounce tracking spec being worked on here?

https://privacycg.github.io/nav-tracking-mitigations/#bounce-tracking-mitigations

If so, I think I may have a bug where I'm not properly handling eTLD+1 hosts. The intent is to treat a.foo.example, b.foo.example, etc., all under the same key.
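A rough sketch of what "the same key" means here, assuming a toy stand-in for the Public Suffix List. `TOY_PUBLIC_SUFFIXES` and `site_key` are hypothetical names; a real implementation would consult the full list, including its wildcard and exception rules, which this sketch ignores.

```python
# Toy stand-in for the Public Suffix List (assumption for illustration only).
TOY_PUBLIC_SUFFIXES = {"com", "example", "evil", "provider"}


def site_key(host: str, public_suffixes=TOY_PUBLIC_SUFFIXES) -> str:
    """Return the eTLD+1 of host: its longest public suffix plus one label."""
    normalized = host.lower().rstrip(".")
    labels = normalized.split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in public_suffixes:
            if i == 0:
                return normalized  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    # No known suffix: treat the last label as the suffix.
    return ".".join(labels[-2:])


for h in ("a.foo.example", "b.foo.example", "somehost.tracking.evil"):
    print(h, "->", site_key(h))
```

With this keying, a.foo.example and b.foo.example both map to foo.example, so they would share one map entry.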

Or is this about a different part of the document?

judielaine commented 1 year ago

Given the hosts in my diagram (assuming I am following the definitions correctly), and given the definition in 6.1 of a user activation map and a candidate bounce tracking map, both of which are maps of site hosts to moments, I assume the result of the TODO and 6.2.1 would be to populate the two maps with the hosts in the diagram.

As a concrete example, consider the following entries (I abbreviate the moments), in the context of an SSO flow as diagrammed and assuming the user had interacted at identity.provider earlier in the day:

user activation map

identity.provider -> 08:00:00
site1.example.com -> 10:03:00

candidate bounce tracking map

federate.example.com -> 10:01

Meanwhile, the cookies in the example could be the following, which raises my question about cookies set at the domain.

identity.provider, sessionId, 111
example.com, mode, night
site1.example.com, sessionid, 222
federate.example.com, sessionId, 333
federate.example.com, idpChoice, identity.provider

So my question is: which cookies go at 10:01? Just the two for 'federate.example.com', or all involved with 'example.com'? (And if it's the former, presumably federate just sets all cookies at example.com, in response to the change.)
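The two readings can be compared with a small sketch over the cookie list above. The helper names `covered_by` and `cookies_to_delete` are hypothetical, and the "equal or subdomain" test is an assumed simplification rather than the spec's actual deletion algorithm.

```python
# (domain, name, value) triples copied from the example cookie list.
COOKIES = [
    ("identity.provider", "sessionId", "111"),
    ("example.com", "mode", "night"),
    ("site1.example.com", "sessionid", "222"),
    ("federate.example.com", "sessionId", "333"),
    ("federate.example.com", "idpChoice", "identity.provider"),
]


def covered_by(cookie_domain: str, scope: str) -> bool:
    """True if cookie_domain equals scope or is a subdomain of it."""
    return cookie_domain == scope or cookie_domain.endswith("." + scope)


def cookies_to_delete(scope: str):
    """Return (domain, name) pairs for cookies falling under scope."""
    return [(d, n) for d, n, _ in COOKIES if covered_by(d, scope)]


# Reading 1: deletion scoped to the bounce host only.
host_level = cookies_to_delete("federate.example.com")
# Reading 2: deletion scoped to the whole eTLD+1.
site_level = cookies_to_delete("example.com")

print("host-level:", host_level)
print("site-level:", site_level)
```

Under the host-level reading only the two federate.example.com cookies are removed; under the site-level reading the example.com and site1.example.com cookies go as well, while the identity.provider cookie is untouched in both cases.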

But perhaps your intent (from your bug comment) is that the maps are at the (registrable?) domain level, the eTLD+1 level, which I understand to be the broadest domain a cookie can be set at? That would mean that, because the user interacted at site1.example.com, a hop through federate.example.com would not be identified as a candidate for bounce tracking.

Thanks!

wanderview commented 1 year ago

Correct, federate.example.com and site1.example.com would be considered same-site, and therefore federate.example.com would not be considered a possible bounce tracking candidate.
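As a sketch of why the keying changes the outcome, the check below compares a host-keyed and a site-keyed user activation map for the example flow. `SITE_OF`, the two maps, and `is_bounce_candidate` are hypothetical names, and the rule "no recorded activation means candidate" is a simplification of whatever the spec ends up defining.

```python
# Hypothetical host-to-site mapping for the hosts in this thread; a real
# implementation would derive it from the Public Suffix List.
SITE_OF = {
    "site1.example.com": "example.com",
    "federate.example.com": "example.com",
    "identity.provider": "identity.provider",
}

# User activation recorded per host (the reading in the question).
activation_by_host = {
    "identity.provider": "08:00:00",
    "site1.example.com": "10:03:00",
}

# The same activations recorded per site (eTLD+1).
activation_by_site = {
    SITE_OF[host]: moment for host, moment in activation_by_host.items()
}


def is_bounce_candidate(bounce_host: str, keyed_by_site: bool) -> bool:
    """Simplified rule: a bounce is a candidate if no activation is recorded for its key."""
    if keyed_by_site:
        return SITE_OF[bounce_host] not in activation_by_site
    return bounce_host not in activation_by_host


print(is_bounce_candidate("federate.example.com", keyed_by_site=False))
print(is_bounce_candidate("federate.example.com", keyed_by_site=True))
```

Keyed by host, federate.example.com has no activation of its own and is flagged; keyed by site, the interaction at site1.example.com covers it and it is not.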

Sorry for the confusion in the spec. It's very much a work in progress and not really ready for consumption yet.

wanderview commented 1 year ago

I'm adding a note to double-check the behavior of [=domain-match=], but note the current spec does actually use eTLD+1 correctly in the user activation map. The host is obtained using the HTML spec's "obtain a site" algorithm, which takes the public suffix list into account.

The spec to populate the bounce map is not written yet, but I'll make sure to have it use the same algorithm.

wanderview commented 1 year ago

I think this has been resolved by recent spec updates. We should be using eTLD+1 sites consistently now.