Notify changes to consent signal

thovden commented 3 years ago

Happy to see ADPC - this is the right direction to go.

At Signatu we are developing and operating a consent platform for registering consent events from users and storing them in an immutable store (e.g., for audits, notifications to 3rd party systems, etc.). This complements nicely the approach you are taking with ADPC.

However, since the signals are sent with every HTTP request, we need to store some state server side in order to determine whether we need to register changes to the user's choices. We could just set a cookie, of course, but I feel the ADPC should deal with this case explicitly. Explicit checks for each HTTP call is not going to be feasible.

A couple of approaches that come to mind:

Extend the spec to allow for a receipt ID from the server in a response header and add this ID to subsequent requests. Of course, this opens up for tracking, ref. #3
Add a changed flag in the ADPC header so the server will get a hint when it needs to persist changes, e.g., ADPC: withdraw=*, object=direct-marketing, changed=1
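
A minimal sketch of how a server might act on such a flag, assuming the comma-separated form shown in the example above. The `changed` parameter is the proposal in this comment, not part of the ADPC spec, and `parse_adpc` / `should_persist` are hypothetical helper names; the parser is deliberately naive and does not implement the spec's grammar.

```python
def parse_adpc(header_value: str) -> dict:
    """Naive parser for a comma-separated ADPC header value (illustration only)."""
    params = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition("=")
        params[key.strip()] = value.strip().strip('"')
    return params


def should_persist(header_value: str) -> bool:
    """True only when the proposed changed=1 hint is present (hypothetical flag)."""
    return parse_adpc(header_value).get("changed") == "1"
```

With this, the per-request work on the server reduces to one cheap check, and the consent store is only touched when the user agent says something changed.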

Any thoughts on this use case?

michael-oneill commented 3 years ago

I think the ID is a no-no because tracking, but the changed flag is a good idea. It could help limit the number of requests from servers and therefore reduce the number of prompts. ADPC: withdraw=*, changed=1 could indicate a refusal of a prompt.

coolharsh55 commented 3 years ago

I agree with the proposal and discussion (and also that IDs are problematic for tracking). However, I'm not clear whether changed=1 needs to be declared only once/first-time, and if so then how will the server understand what has changed (which is the initial question in the issue)?

Since the ADPC does not transmit an identifier, the indication would be that it relies on the website to collect and handle that identity/identifier. For example, ADPC: consent=1 implies some consent granted to purpose 1. Does this mean it is the website's duty to create an identifier and assign it to this user, and if so, then where is that identifier stored locally for re-identification in subsequent visits? Is this taking us back to cookies for storage? Similar example, ADPC: withdraw=1 implies withdrawal of prior consent, but how to identify which user this relates to without an identifier?

michael-oneill commented 3 years ago

The server can always see what this user has agreed to, if anything, in the ADPC header, which it receives in every HTTP request. It does not need to identify the user; all the state needed is in the header. Of course legally any browser recorded state needs either an exemption or user agreement, but a browser managed consent signal and protocol such as ADPC implies state is recorded in the browser. The point of @thovden's proposal as I understand it is not to require extra state recorded in a cookie or other storage, which would be problematic as it would need an exemption under ePrivacy, inevitably creating a loophole. Keeping it all in the ADPC header allows us to minimise the entropy.

coolharsh55 commented 3 years ago

Yes, the server can see the preferences in the request contents, but not who has set those preferences. So if a preference is changed (assume withdraw or permission set to prohibition), how will the server/website/controller interpret it and decide which user's data processing must be stopped? So I disagree that all state needed is in the header, since the header (or request contents) do not contain an identifier. So how does the server understand which user has given that preference or changed it? The request ids may themselves also be used as identifiers for the user, but this is not a feasible solution, e.g. when using standardised request ids. See Issue #3 for this discussion regarding tracking using ids.

michael-oneill commented 3 years ago

All an internet protocol knows about are endpoints. The person using an endpoint has to be inferred; how else would you do it? Remember cookies are headers too, communicated along with all the others. Creating a new header allows you to place restrictions on its ability to be used as a tracker. Its operation should not require further state to be recorded or signalled.

thovden commented 3 years ago

On subject IDs - server-side consent managers like Signatu must map an HTTP request to a subject ID in order to store the consent event and enable retrieval of the event later, e.g., on other devices or other browser sessions. So if we're not getting an ID with the ADPC header we will need to get it from somewhere else, e.g., a cookie. If how the user agent and server handle subject IDs is left outside of the ADPC spec, we'll probably get a lot of different approaches for keeping track of a subject ID, and new creative forms of tracking.

For example - in the Incognito case, or where the user has explicitly said "do not track" - I assume the header will be something like ADPC: withdraw=*. In that case should the consent manager persist the consent signal? Will it be OK to try to map an incognito user (server side) to a subject ID for consent purposes?

If instead we add a subject and a receipt ID to the headers we can e.g., signal that we don't want any persistence like ADPC: withdraw=*,subject=anon. In this case it would not be permitted for the server to store this signal.

The interaction would be something like this:

# client -> server
ADPC: consent="q1analytics q2recommendation"; changed=1; subject=2ec8a6e23cab
# server responds with a receipt ID - i.e., server has understood and registered the signals. 
ADPC-RECEIPT-ID: cecd5bce-aa15-48d4-a792-ffc9ddb7640d

# client -> server - no need for any subject IDs if signals have not changed
ADPC: withdraw=*; receipt=cecd5bce-aa15-48d4-a792-ffc9ddb7640d; changed=0;
# ...or just provide the receipt ID which the server will understand as withdraw=*
ADPC: receipt=cecd5bce-aa15-48d4-a792-ffc9ddb7640d; 

# client -> server - when changed=1 provide subject. Receipt is not required
ADPC: withdraw=*; subject=2ec8a6e23cab; changed=1;
# server responds 
ADPC-RECEIPT-ID: df72adcf-f653-4590-98bc-5fd539908295

# client -> server
ADPC: withdraw=*; subject=anon; changed=1;
# server will not record the consent signal and provides no receipt
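
A rough server-side sketch of the exchange above, under stated assumptions: the `subject`, `changed` and `receipt` parameters and the `ADPC-RECEIPT-ID` response header are the proposal in this comment, not part of the ADPC spec; `parse_params` and `handle_adpc` are hypothetical names; and a plain dict stands in for the real consent log.

```python
import uuid


def parse_params(header_value):
    """Split a semicolon-separated header value into a dict (simplified grammar)."""
    params = {}
    for part in header_value.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:
            params[key] = value.strip('"')
    return params


def handle_adpc(header_value, store):
    """Return a new receipt ID if the signal was recorded, otherwise None.

    `store` maps receipt IDs to (subject, signals) and is a stand-in for an
    immutable consent log.
    """
    params = parse_params(header_value)
    subject = params.pop("subject", None)
    changed = params.pop("changed", "0") == "1"
    params.pop("receipt", None)  # an echoed receipt needs no new record
    if not changed or subject is None or subject == "anon":
        # Nothing new to record, or the user asked for no persistence.
        return None
    receipt_id = str(uuid.uuid4())
    store[receipt_id] = (subject, params)
    return receipt_id
```

The returned value would be sent back as the receipt header; a `None` result means the response carries no receipt.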

This can be used for tracking #3 of course, but so could any cookie set on the site origin domain. This approach makes subject ID handling explicitly defined under ADPC. So how do we prevent the ADPC header from being misused for tracking? It will never be fool-proof, but I guess we can:

Set the subject and receipt parameters only for HTTP requests to the site origin, remove them for all other requests.
Extend CSP with a consent-src permission that will allow the subject and receipt to be shared with named 3rd party domains (typically CMPs).
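
On the user agent side, the two rules above could look roughly like this. It is only a sketch: `consent-src` does not exist in CSP today, so the allow-list is passed in as a plain tuple of host names, and `adpc_for_request` is a hypothetical function name.

```python
from urllib.parse import urlsplit

# Parameters that could identify the user (proposed in this thread, not in the spec).
IDENTIFYING_PARAMS = {"subject", "receipt"}


def adpc_for_request(header_value, request_url, site_origin, consent_src=()):
    """Drop subject/receipt unless the request goes to the site origin or an allowed host."""
    target = urlsplit(request_url)
    origin = urlsplit(site_origin)
    same_origin = (target.scheme, target.netloc) == (origin.scheme, origin.netloc)
    if same_origin or target.hostname in consent_src:
        return header_value
    kept = []
    for part in header_value.split(";"):
        part = part.strip()
        if part and part.partition("=")[0] not in IDENTIFYING_PARAMS:
            kept.append(part)
    return "; ".join(kept)
```

Third-party requests still carry the consent signals themselves, just without anything that links them to a subject or an earlier receipt.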

So in summary, I'm looking for a way to explicitly deal with a) the subject ID, and b) giving the user agent a receipt ID that indicates that the signal has been registered for the subject ID. The user agent could allow the user later to review their consent receipts, and revoke permissions in a user agent dashboard and so on. What the subject ID should be is up to the user agent and/or the client side code - we can imagine user agents assigning new IDs regularly based on user preference, for example.

These are not fully formed thoughts and I'm sure there are many issues with the proposed approach above.

robrwo commented 3 years ago

If the consent request were stored in a well-known location #9 then the browser can check the cache and look for updates. (If the Last-Modified date is newer than the last consent withdrawn/accepted date, then it's been updated and the UA can notify the user.)
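
As a sketch of that check, assuming the user agent remembers when the user last accepted or withdrew as a timezone-aware datetime (`consent_request_updated` is a hypothetical name):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def consent_request_updated(last_modified_header, last_decision_at):
    """True if the resource changed after the user's last accept/withdraw decision."""
    return parsedate_to_datetime(last_modified_header) > last_decision_at
```

The user agent would only need to prompt again when this returns true.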

coolharsh55 commented 3 years ago

Storing consent requests in .well-known will not work if the consent request is not uniform for all cases, but is specific for certain cases or individuals.

michael-oneill commented 3 years ago

For the sake of transparency consent request text strings should be viewable by all. Not only should they be restricted to a .well-known location, the browser should strip Cookie headers from the HTTP request.

robrwo commented 3 years ago

"Storing consent requests in .well-known will not work if the consent request is not uniform for all cases, but is specific for certain cases or individuals."

Making consent strings user-dependent opens it up for misuse, e.g. #3.

gb-noyb commented 3 years ago

Thanks for opening this discussion. It is good to hear the views from implementers of consent management software, as you may raise valid practical issues that we might have overlooked. To the original question:

Any thoughts on this use case?

Some thoughts that pop up (not fully well-formed either; apologies for this unstructured and inconclusive rain of bullet points):

In many cases, it may not be needed to store the decisions on the server: the user repeats their decision each visit, so the website can conditionally perform the processing depending on whether the user consented to it (e.g. count the visit or recommend content based on their IP address, etc.). See section 7.2:

The user agent SHOULD repeat the ADPC header with every HTTP request it makes to the website, as long as it is applicable. The repetitions enable a website to know the user’s decision without keeping records itself.
Generally, we should try to avoid creating extra personal data processing (e.g. assigning people identifiers) merely for consent management. It would be ironic to have a privacy protection mechanism where, for example, you’d have to tell the website who you are in order for it to know whether you consented to be tracked for anonymous statistics.
For cases that are not such a one-off processing task (e.g. building and using a profile of the user’s interests), storing some records of personal data would be part of the processing, and upon withdrawal of consent the website would need to know which data to stop processing and erase. In most (all?) such cases however, the website would already have some type of identifier for the user (e.g. profiling interests would not be possible otherwise), usually stored in a cookie. For which cases would we then need yet another identifier?
Part of your issue is about making it simpler for the web server to know whether it needs to act when receiving an ADPC header: “Explicit checks for each HTTP call is not going to be feasible.” I would like to understand better what the problems and options there are; why exactly updating a variable is considered infeasible.
Withdrawing consent could already be used as a ‘change’ signal. In theory one should not have to repeat it after having passed it to the server once (though we don’t have an acknowledgement of receipt, one would have to trust that the server received and processed it).
The suggestion in the issue title, “Notify changes to consent signal”, seems simple on first impression, but on second thought it requires defining compared to what a signal has changed. Compared to a previous visit with the same cookie jar? (e.g. the same person may switch to another “context” or “private browsing” mode; effectively appearing as a different person)
It looks like the conversation has (therefore?) moved more to the idea of incorporating identifiers for the subjects (or the individual transactions); with an identifier you can indeed more easily know which person changed their decisions. But again, it would be great if we do not have to identify people to apply their decisions.
Note that all this is very related to the spec’s section about “Personal scope”:

The same person may or may not be recognisable to the website on a subsequent visit (for example when the user deletes stored IDs or uses another device or account), and may thus be considered a new user from the website’s perspective.

The scope of the user’s exercise of rights is therefore limited to any personal data and information that relates to the user present in any transaction.
If a user once gave consent to being profiled, then deletes their cookies, then wants to withdraw their consent — then indeed there will be no way to tell the website which profile to delete, if the cookies were the person’s only identifier. I don’t know if this is a grave problem; at least it does not seem to be different from the status quo.
If we find that it is in fact a problem, I do see some appeal in having a ‘receipt number’ (programmers may think of it as an object capability or callback function) for each consent one has given; it would however not be an identifier for the subject, and it would not be passed to the website except when withdrawing the consent. This would both ensure one can withdraw consent without needing to keep cookies etc, and make it easier for the website to know which data to erase. (see also #16)
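
To make the ‘receipt number’ idea a bit more concrete, here is a minimal sketch of the server side, assuming an unguessable random token per consent. The class and method names are hypothetical, and a dict stands in for whatever the website uses to find the data it must erase.

```python
import secrets


class ConsentReceipts:
    """Maps one-off receipt tokens to the data covered by a consent (illustration only)."""

    def __init__(self):
        self._records = {}

    def grant(self, purpose, data_ref):
        # The token is random per consent, so it does not identify the subject.
        token = secrets.token_urlsafe(16)
        self._records[token] = (purpose, data_ref)
        return token

    def withdraw(self, token):
        """Return what must be erased, or None if the token is unknown or already used."""
        return self._records.pop(token, None)
```

The user agent would keep the token and present it only when withdrawing, so it never travels with ordinary requests.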

robrwo commented 3 years ago

I agree that explicit checks for every HTTP request are not feasible. As I've noted in #9, adding ADPC headers in the requests and responses will increase the size of requests/responses. Web pages often include many embedded resources (images and other media, scripts, stylesheets) so this can easily add a few hundred bytes to each request.

For users on mobile networks, or with slow internet connections, or even busy/overloaded networks, this is a real performance concern.

Likewise, processing ADPC headers can affect performance. Adding a few milliseconds of processing time for pages adds up to an extra second of processing for a few hundred pages.

Speed performance affects not just SEO but (perhaps more importantly) cost for cloud services where processing time and network bandwidth are metered.

If you want developers to use the protocol, then you need to consider ways to streamline the protocol so that

it has a negligible effect on user agents that do not support ADPC
headers are only sent in requests to query ADPC the first time on a site, to check for changes, or to submit changes
websites only send ADPC responses when the user agent indicates that it supports ADPC

Data-Protection-Control / ADPC

Notify changes to consent signal #6