Data-Protection-Control / ADPC

Advanced Data Protection Control (ADPC) is a mechanism to communicate data subjects' (users') consent and privacy decisions with data controllers (service providers).
http://dataprotectioncontrol.org
Mozilla Public License 2.0

Notify changes to consent signal #6

Open thovden opened 3 years ago

thovden commented 3 years ago

Happy to see ADPC - this is the right direction to go.

At Signatu we are developing and operating a consent platform for registering consent events from users and storing them in an immutable store (e.g., for audits, notifications to 3rd party systems, etc.). This nicely complements the approach you are taking with ADPC.

However, since the signals are sent with every HTTP request, we need to store some state server side in order to determine whether we need to register changes to the user's choices. We could just set a cookie, of course, but I feel ADPC should deal with this case explicitly. Explicit checks for each HTTP call are not going to be feasible.

A couple of approaches that come to mind:

Any thoughts on this use case?

michael-oneill commented 3 years ago

I think the ID is a no-no because tracking, but the changed flag is a good idea. It could help limit the number of requests from servers and therefore reduce the number of prompts. ADPC: withdraw=*, changed=1 could indicate a refusal of a prompt.
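As a rough illustration of the flag discussed here, the following sketch parses a header value in the forms used in this thread and checks for the "refused prompt" combination. The separators (`,` or `;`), the `changed` key, and the function names are assumptions taken from the examples in this discussion, not from the ADPC specification.

```python
import re


def parse_adpc(value):
    """Parse an ADPC-style header value into {key: [tokens]}.

    Assumption: fields are separated by ',' or ';' and a value may be a
    double-quoted, space-separated list, as in the examples in this thread.
    Quoted values containing ',' or ';' are not handled by this sketch.
    """
    fields = {}
    for part in re.split(r"[;,]", value):
        key, sep, raw = part.strip().partition("=")
        if not sep:
            continue  # skip empty or malformed fragments
        fields[key.strip()] = raw.strip().strip('"').split()
    return fields


def is_prompt_refusal(fields):
    """True for the proposed 'withdraw=*, changed=1' combination."""
    return fields.get("withdraw") == ["*"] and fields.get("changed") == ["1"]


signal = parse_adpc("withdraw=*, changed=1")
print(is_prompt_refusal(signal))
```

A server could use such a check to stop re-sending a consent request that the user has just declined.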

coolharsh55 commented 3 years ago

I agree with the proposal and discussion (and also that IDs are problematic for tracking). However, I'm not clear whether changed=1 needs to be declared only once/first-time, and if so then how will the server understand what has changed (which is the initial question in the issue)?

Since the ADPC does not transmit an identifier, the indication would be that it relies on the website to collect and handle that identity/identifier. For example, ADPC: consent=1 implies some consent granted to purpose 1. Does this mean it is the website's duty to create an identifier and assign it to this user, and if so, then where is that identifier stored locally for re-identification in subsequent visits? Is this taking us back to cookies for storage? Similar example, ADPC: withdraw=1 implies withdrawal of prior consent, but how to identify which user this relates to without an identifier?

michael-oneill commented 3 years ago

The server can always see what this user has agreed to, if anything, in the ADPC header, which it receives in every HTTP request. It does not need to identify the user; all the state needed is in the header. Of course, legally any browser-recorded state needs either an exemption or user agreement, but a browser-managed consent signal and protocol such as ADPC implies state is recorded in the browser. The point of @thovden's proposal, as I understand it, is not to require extra state recorded in a cookie or other storage, which would be problematic as it would need an exemption under ePrivacy, inevitably creating a loophole. Keeping it all in the ADPC header allows us to minimise the entropy.

coolharsh55 commented 3 years ago

Yes, the server can see the preferences in the request contents, but not who has set those preferences. So if a preference is changed (assume withdraw or permission set to prohibition), how will the server/website/controller interpret it and decide which user's data processing must be stopped? So I disagree that all state needed is in the header, since the header (or request contents) does not contain an identifier. So how does the server understand which user has given that preference or changed it? The request ids may themselves also be used as identifiers for the user, but this is not a feasible solution, e.g. when using standardised request ids. See Issue #3 for this discussion regarding tracking using ids.

michael-oneill commented 3 years ago

All an internet protocol knows about are endpoints. The person using an endpoint has to be inferred; how else would you do it? Remember cookies are headers too, communicated along with all the others. Creating a new header allows you to place restrictions on its ability to be used as a tracker. Its operation should not require further state to be recorded or signalled.

thovden commented 3 years ago

On subject IDs - server-side consent managers like Signatu must map an HTTP request to a subject ID in order to store the consent event and enable retrieval of the event later, e.g., on other devices or other browser sessions. So if we're not getting an ID with the ADPC header we will need to get it from somewhere else, e.g., a cookie. If how the user agent and server handle subject IDs is left outside of the ADPC spec, we'll probably get a lot of different approaches for keeping track of a subject ID, and new creative forms of tracking.

For example - in the Incognito case, or where the user has explicitly said "do not track" - I assume the header will be something like ADPC: withdraw=*. In that case should the consent manager persist the consent signal? Will it be OK to try to map an incognito user (server side) to a subject ID for consent purposes?

If instead we add a subject and a receipt ID to the headers we can e.g., signal that we don't want any persistence like ADPC: withdraw=*,subject=anon. In this case it would not be permitted for the server to store this signal.

The interaction would be something like this:

# client -> server
ADPC: consent="q1analytics q2recommendation"; changed=1; subject=2ec8a6e23cab
# server responds with a receipt ID - i.e., server has understood and registered the signals. 
ADPC-RECEIPT-ID: cecd5bce-aa15-48d4-a792-ffc9ddb7640d

# client -> server - no need for any subject IDs if signals have not changed
ADPC: withdraw=*; receipt=cecd5bce-aa15-48d4-a792-ffc9ddb7640d; changed=0;
# ...or just provide the receipt ID which the server will understand as withdraw=*
ADPC: receipt=cecd5bce-aa15-48d4-a792-ffc9ddb7640d; 

# client -> server - when changed=1 provide subject. Receipt is not required
ADPC: withdraw=*; subject=2ec8a6e23cab; changed=1;
# server responds 
ADPC-RECEIPT-ID: df72adcf-f653-4590-98bc-5fd539908295

# client -> server
ADPC: withdraw=*; subject=anon; changed=1;
# server will not record the consent signal and provides no receipt
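The exchange above could be handled server-side roughly as in the sketch below. It is only an illustration of the proposal as written in this comment: the `subject`, `changed` and `receipt` fields and the `ADPC-RECEIPT-ID` response are the proposed extensions, not part of ADPC, and the `ConsentStore` class and its in-memory storage are hypothetical.

```python
import uuid


def parse(header_value):
    """Split 'key=value; key=value;' into a dict (thread syntax, assumed)."""
    fields = {}
    for part in header_value.split(";"):
        key, sep, raw = part.strip().partition("=")
        if sep:
            fields[key] = raw.strip('"')
    return fields


class ConsentStore:
    """Hypothetical append-only store for consent events."""

    def __init__(self):
        self.events = []      # (subject, signal, receipt_id), never mutated
        self.by_receipt = {}  # receipt_id -> signal registered under it

    def handle(self, header_value):
        """Return a receipt ID for the response, or None if nothing is stored."""
        fields = parse(header_value)
        changed = fields.pop("changed", "0") == "1"
        subject = fields.pop("subject", None)
        fields.pop("receipt", None)  # an echoed receipt carries no new signal
        # Unchanged signals need no work; 'anon' (or a missing subject)
        # means the signal must not be persisted.
        if not changed or subject in (None, "anon"):
            return None
        receipt_id = str(uuid.uuid4())
        self.events.append((subject, dict(fields), receipt_id))
        self.by_receipt[receipt_id] = dict(fields)
        return receipt_id


store = ConsentStore()
first = store.handle('consent="q1analytics q2recommendation"; changed=1; subject=2ec8a6e23cab')
store.handle("withdraw=*; receipt=%s; changed=0;" % first)  # no new event
store.handle("withdraw=*; subject=anon; changed=1;")         # not recorded
print(len(store.events))
```

With this shape the per-request cost for unchanged signals is a single parse and an early return, which is the point of the `changed` flag.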

This can be used for tracking #3 of course, but so could any cookie set on the site origin domain. This approach makes subject ID handling explicitly defined under ADPC. So how do we prevent the ADPC header from being misused for tracking? It will never be fool-proof, but I guess we can:

So in summary, I'm looking for a way to explicitly deal with a) the subject ID, and b) giving the user agent a receipt ID that indicates that the signal has been registered for the subject ID. The user agent could allow the user later to review their consent receipts, and revoke permissions in a user agent dashboard and so on. What the subject ID should be is up to the user agent and/or the client side code - we can imagine user agents assigning new IDs regularly based on user preference, for example.

These are not fully formed thoughts and I'm sure there are many issues with the proposed approach above.

robrwo commented 3 years ago

If the consent request were stored in a well-known location #9 then the browser can check the cache and look for updates. (If the Last-Modified date is newer than the last consent withdrawn/accepted date, then it's been updated and the UA can notify the user.)
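The date comparison described here might look like the following sketch on the user agent side. The function name and the idea of storing the last decision time as a timezone-aware `datetime` are assumptions for illustration; fetching and caching the well-known resource is left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def consent_request_updated(last_modified_header, last_decision):
    """True if the consent request changed after the user's last decision.

    last_modified_header: value of the HTTP Last-Modified response header.
    last_decision: timezone-aware datetime of the last consent/withdrawal.
    """
    modified = parsedate_to_datetime(last_modified_header)
    if modified.tzinfo is None:
        # Treat a date without zone information as UTC (assumption).
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > last_decision


decided = datetime(2021, 6, 1, tzinfo=timezone.utc)
print(consent_request_updated("Wed, 09 Jun 2021 10:18:14 GMT", decided))
```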

coolharsh55 commented 3 years ago

Storing consent requests in .well-known will not work if the consent request is not uniform for all cases, but is specific for certain cases or individuals.

michael-oneill commented 3 years ago

For the sake of transparency consent request text strings should be viewable by all. Not only should they be restricted to a .well-known location, the browser should strip Cookie headers from the HTTP request.

robrwo commented 3 years ago

"Storing consent requests in .well-known will not work if the consent request is not uniform for all cases, but is specific for certain cases or individuals."

Making consent strings user-dependent opens it up for misuse, e.g. #3.

gb-noyb commented 3 years ago

Thanks for opening this discussion. It is good to hear the views of implementers of consent management software, as you may raise valid practical issues that we might have overlooked. To the original question:

"Any thoughts on this use case?"

Some thoughts that pop up (not fully well-formed either; apologies for this unstructured and inconclusive rain of bullet points):

robrwo commented 3 years ago

I agree that explicit checks for every HTTP request are not feasible. As I've noted in #9, adding ADPC headers to the requests and responses will increase their size. Web pages often include many embedded resources (images and other media, scripts, stylesheets), so this can easily add a few hundred bytes to each request.

For users on mobile networks, or with slow internet connections, or even busy/overloaded networks, this is a real performance concern.

Likewise, processing ADPC headers can affect performance. Adding a few milliseconds of processing time for pages adds up to an extra second of processing for a few hundred pages.
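To make the scale concrete, here is a back-of-the-envelope calculation. All numbers are illustrative assumptions (200 bytes of ADPC headers per request, 50 embedded resources per page, 3 ms of processing per page, 300 pages), not measurements.

```python
def extra_bytes_per_page(header_bytes, requests_per_page):
    """Added upstream traffic if every request carries the ADPC header."""
    return header_bytes * requests_per_page


def extra_seconds(ms_per_page, pages):
    """Added server processing time across a number of page loads."""
    return ms_per_page * pages / 1000


# Assumed figures, for illustration only.
print(extra_bytes_per_page(200, 50))  # bytes added to one page load
print(extra_seconds(3, 300))          # seconds added across 300 pages
```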

Performance affects not just SEO but (perhaps more importantly) cost for cloud services where processing time and network bandwidth are metered.

If you want developers to use the protocol, then you need to consider ways to streamline it so that

  1. it has a negligible effect on user agents that do not support ADPC
  2. headers are only sent in requests to query ADPC the first time on a site, to check for changes, or to submit changes
  3. websites only send ADPC responses when the user agent indicates that it supports ADPC
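The last two points could be expressed as simple predicates, sketched below. The function names are hypothetical, and treating the presence of an `ADPC` request header as the user agent's support signal is an assumption; the thread does not settle how support would be indicated.

```python
def user_agent_should_send_adpc(first_visit, checking_for_changes, has_pending_changes):
    """Point 2: send the header only when there is something to ask or say."""
    return first_visit or checking_for_changes or has_pending_changes


def server_should_send_adpc(request_headers):
    """Point 3: respond with ADPC only if the user agent signalled support.

    Assumption: support is signalled by an 'ADPC' request header
    (header names are compared case-insensitively).
    """
    return any(name.lower() == "adpc" for name in request_headers)


print(user_agent_should_send_adpc(False, False, False))
print(server_should_send_adpc({"Accept": "text/html"}))
```

Under these rules a user agent without ADPC support never receives ADPC responses, which also covers point 1.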