w3c / activitypub

http://w3c.github.io/activitypub/

Optional http-equiv for objects, to express user opt outs and/or preferences #403

Closed dmarti closed 1 month ago

dmarti commented 9 months ago

Some users have concerns about how their content or personal info is used. For example, some users do not want the content they created to be used for training generative AI systems, and some users do not want to have their personal information shared or sold.

In centralized systems, a user can indicate preferences and/or opt outs with an HTTP header on their own client, or apply settings to their own account on a server to have the HTTP header set by that server. However, in ActivityPub a user's content is federated across servers, so an HTTP header set by the user or on their home instance would be visible only in the original session.

Another problem with trying to standardize a set of preferences and opt outs for ActivityPub would be keeping a list up to date with all the available possible choices for all users. It's hard to know in advance what options a user will choose to exercise.

One possible way to make it work would be to extend section 3.1 of the spec, to allow information equivalent to HTTP headers to travel along with an object.

An object MAY have an http-equiv property, which is a list of header name-value pairs. A server MUST process that object as if it had been received over an HTTP connection with that header name set to that value.

For example, a user who chose to express a preference that their content not be used for AI training, and that their information not be shared or sold, could send something like this:

{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "content": "Don't surveil me bro",
  "http-equiv": [{"X-Robots-Tag": "noai"}, {"Sec-GPC": 1}]
}
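
As a rough illustration of the proposed processing rule, a receiving server might flatten the `http-equiv` list into a single header mapping before applying whatever header-handling logic it already has. This is only a sketch under assumptions: the function name `http_equiv_headers` is hypothetical, and lowercasing names and stringifying values are choices made here (HTTP header names are case-insensitive, and values travel as strings), not something the proposal specifies.

```python
import json


def http_equiv_headers(obj):
    """Flatten the proposed http-equiv list into one {name: value} dict.

    Hypothetical helper: the proposal only says the property is a list of
    header name-value pairs, so each list entry is treated as a small dict.
    """
    headers = {}
    for entry in obj.get("http-equiv", []):
        for name, value in entry.items():
            # Assumption: normalize names to lowercase and values to strings,
            # mirroring how headers would look on an HTTP connection.
            headers[name.lower()] = str(value)
    return headers


note = json.loads("""
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Note",
  "content": "Don't surveil me bro",
  "http-equiv": [{"X-Robots-Tag": "noai"}, {"Sec-GPC": 1}]
}
""")

headers = http_equiv_headers(note)
```

An object without the property simply yields an empty mapping, so existing objects would be unaffected.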
snarfed commented 9 months ago

Interesting idea! I like it.

I suspect we'd still end up wanting to maintain some kind of allowlist, since it would be misleading to see many other existing HTTP headers there, eg Content-Type, Cache-Control, Date, Host, Location, Authorization, etc. Unlikely, granted, until some misguided implementation starts to copy in headers from elsewhere. Minor nit though.

dmarti commented 9 months ago

@snarfed Thank you. It might be possible to avoid the registry by saying this only applies to options available to users, and does not apply to HTTP headers that a user can't change. Maybe something like...

For each header name, if the header communicates an option, preference or opt-out that is available to a user of a server and documented in the server's user documentation and/or privacy policy, then the server MUST process the object as if it had been received over an HTTP connection with that header name set to that value.

If a server doesn't do data sharing/selling, so doesn't check the GPC header for local users, then it can ignore the http-equiv entry too.
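
One way to picture this narrower rule: the server keeps a set of header names that correspond to options it documents for its own users, and ignores every other `http-equiv` entry. The sketch below assumes a hypothetical server that documents an AI-training opt-out but does no data sharing or selling; the names `DOCUMENTED_USER_OPTIONS` and `applicable_entries` are illustrative and not part of any spec.

```python
# Hypothetical: header names this server documents as user-facing options.
# This example server does not share or sell data, so Sec-GPC is absent.
DOCUMENTED_USER_OPTIONS = {"x-robots-tag"}


def applicable_entries(obj, documented=DOCUMENTED_USER_OPTIONS):
    """Return only the http-equiv entries this server is expected to honor."""
    applicable = {}
    for entry in obj.get("http-equiv", []):
        for name, value in entry.items():
            key = name.lower()  # header names are case-insensitive
            if key in documented:
                applicable[key] = str(value)
    return applicable


note = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Note",
    "content": "Don't surveil me bro",
    "http-equiv": [{"X-Robots-Tag": "noai"}, {"Sec-GPC": 1}],
}

honored = applicable_entries(note)
```

Here the `Sec-GPC` entry is dropped and unrelated headers such as `Content-Type` would be dropped the same way, which also addresses the concern about stray headers without a shared registry.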

I'm not opposed to having a registry, though, and would volunteer to help maintain it if that's the best way to make this work.

evanp commented 5 months ago

First of all, @dmarti, this is an interesting proposal for bringing some of the important work in privacy control and consent for processing to the fediverse.

In our discussion in issue triage, we concluded that the solution is too broad. In particular, HTTP has a great many headers, and only a small minority would be applicable for this use.

We think it makes more sense to create a FEP (Federation Enhancement Proposal) for the specific terms like X-Robots-Tag that could be native JSON-LD properties instead of name-value pairs. The proposal could hew very closely to the existing specs for HTTP headers and even reference those specs. So, your example might look more like the following:

{
  "@context": ["https://www.w3.org/ns/activitystreams", "https://fep.example/ns/privacyHeaders"],
  "type": "Note",
  "content": "Don't surveil me bro",
  "xRobotsTag": "noai",
  "secGPC": 1
}
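
For comparison, a consumer of the native-property form could read the two terms directly. This is a minimal sketch, assuming the illustrative property names `xRobotsTag` and `secGPC` from the example above (the `fep.example` context is a placeholder, not a published vocabulary). It also assumes `xRobotsTag` may carry several comma-separated directives, as the HTTP header can, and that a `secGPC` value of 1 expresses the opt-out; the function name `privacy_signals` is hypothetical.

```python
def privacy_signals(obj):
    """Read the illustrative privacy properties from an ActivityPub object."""
    # Assumption: directives are comma-separated and case-insensitive.
    raw = str(obj.get("xRobotsTag", ""))
    directives = {d.strip().lower() for d in raw.split(",")}
    return {
        "no_ai_training": "noai" in directives,
        # Accept either the number 1 or the string "1".
        "gpc": str(obj.get("secGPC", "")) == "1",
    }


note = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        "https://fep.example/ns/privacyHeaders",
    ],
    "type": "Note",
    "content": "Don't surveil me bro",
    "xRobotsTag": "noai",
    "secGPC": 1,
}

signals = privacy_signals(note)
```

Because each term is a named property rather than an arbitrary header pair, an implementation only needs to know the terms it supports and can ignore the rest.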

There are also some tags in use from the Mastodon namespace, including indexable, discoverable, and so on, that might map clearly onto this use case. So it's worth comparing those. ODRL might also be a solution. OcapPub might also be applicable here. CCRDF might also apply.

Given this advice, I'm going to mark this issue as Needs FEP and leave it open for comment pending closure.

dmarti commented 5 months ago

Update: FEP at https://codeberg.org/fediverse/fep/src/branch/main/fep/5e53/fep-5e53.md

evanp commented 1 month ago

I believe this FEP follows the suggestions made in this issue, and the process looks to be moving along well. It makes sense to close this issue, so I'll close it.

dmarti commented 1 month ago

Thank you, @evanp -- there is now a discussion thread open for opt-out preference signals, here: https://socialhub.activitypub.rocks/t/fep-5e53-opt-out-preference-signals/4323