w3c / activitypub

http://w3c.github.io/activitypub/

as:Authenticated proposal #339

Closed kaniini closed 3 weeks ago

kaniini commented 5 years ago

Description

This issue proposes adding a new security label to ActivityPub alongside as:Public called as:Authenticated.

The as:Authenticated security label would behave like as:Public but, in a compliant implementation, would allow only authenticated users to view the object.

The as:Authenticated security label would be represented over the wire as https://www.w3.org/ns/activitystreams#Authenticated.

Motivation

Recently there have been a few high profile incidents involving wholesale archival of fediverse instances. This has led to many users expressing unhappiness, as they had "buyer's remorse" about their decision to label posts as as:Public.

Discussion frequently reveals that users fall back to as:Public instead of addressing their followers collection because they want to reach a wider audience of fediverse participants, without considering the possibility of permanent archival.

By introducing a new security label, as:Authenticated, we introduce an alternative to as:Public for users who wish to make posts that any fediverse participant may interact with but that exclude unauthenticated guest users.

JSON-LD Context

{
  "@context": [
    {"as": "https://www.w3.org/ns/activitystreams#"},
    {"Authenticated": "as:Authenticated"}
  ]
}
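As a rough illustration of how the compact form relates to the wire form given above, the following Python sketch expands a prefixed term against a prefix map. It is a deliberately tiny stand-in for real JSON-LD processing, and it assumes the `as` prefix is bound to the namespace IRI ending in `#`. The function name `expand_compact_iri` is hypothetical and not part of any library.

```python
# Minimal sketch of compact-IRI expansion; this is not a JSON-LD processor.
# Assumption: the "as" prefix is bound to the ActivityStreams namespace
# including the trailing "#", so that terms expand to the wire form.
AS_NS = "https://www.w3.org/ns/activitystreams#"


def expand_compact_iri(value: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:suffix' using the prefix map; leave other values alone."""
    prefix, sep, suffix = value.partition(":")
    # Absolute IRIs such as "https://..." also contain ":", so skip those.
    if sep and prefix in prefixes and not suffix.startswith("//"):
        return prefixes[prefix] + suffix
    return value


prefixes = {"as": AS_NS}
print(expand_compact_iri("as:Authenticated", prefixes))
print(expand_compact_iri("as:Public", prefixes))
```

Without the trailing `#` on the prefix, the expansion would not match the wire form stated in the description.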

Security Considerations

Rogue Implementations

This section is non-normative.

The Fediverse is an open world network built on top of ActivityPub and related standards. Accordingly, an implementation may choose to be intentionally non-compliant with certain aspects of the protocol, such as not enforcing security labels or limiting access to specified collections. This threat is not changed by this proposal, but a compliant server would ideally provide a way to make trust assertions about its peers, which includes limiting what objects are shared with those peers.

Authenticated Fetches

When as:Authenticated objects are Announced through the network, they should be referenced by their id. The recipient of the Announce MUST make an authenticated fetch signed with HTTP Signatures, using a key belonging to an actor on the server. The server would ideally use a server-specific actor's key instead of a user's key if one is present.

The server which hosts the as:Authenticated object being fetched SHOULD verify that the signature belongs to a key which belongs to an instance that is authorized to fetch the object. The specific logic for determining whether an authenticated fetch is allowed is unspecified and implementation dependent.
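Since the proposal leaves the authorization logic implementation dependent, the following Python sketch shows only one possible shape for it. It assumes signature verification has already happened elsewhere and yields either a verified actor id or nothing. The names `may_fetch`, `audience` and `blocked_hosts` are hypothetical, and followers-collection membership is deliberately left out.

```python
from typing import Optional
from urllib.parse import urlparse

AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
AS_AUTHENTICATED = "https://www.w3.org/ns/activitystreams#Authenticated"


def audience(obj: dict) -> set:
    """Collect the addressing targets from 'to' and 'cc'."""
    targets = set()
    for field in ("to", "cc"):
        value = obj.get(field, [])
        targets.update([value] if isinstance(value, str) else value)
    return targets


def may_fetch(obj: dict, verified_actor: Optional[str], blocked_hosts: set) -> bool:
    """Decide whether to serve obj to the requester.

    verified_actor is the id of the actor whose key signed the request,
    or None for an unsigned request. Verifying the HTTP Signature is
    assumed to have been done by the caller.
    """
    targets = audience(obj)
    if AS_PUBLIC in targets:
        return True  # public objects are served to anyone
    if verified_actor is None:
        return False  # unsigned requests never see non-public objects
    if urlparse(verified_actor).hostname in blocked_hosts:
        return False  # admin-level instance block (hypothetical policy)
    return AS_AUTHENTICATED in targets or verified_actor in targets


note = {
    "to": [AS_AUTHENTICATED],
    "cc": ["https://example.com/users/foobar/followers"],
}
print(may_fetch(note, None, set()))
print(may_fetch(note, "https://remote.example/actor", set()))
```

A real server would also need to decide what counts as an "authenticated user", which the thread treats as an open question.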

jonaharagon commented 5 years ago

I saw a discussion about this on Mastodon and I have to admit, I don't really understand how this will prevent bots from scraping profiles. Bot operators will trivially spin up their own servers or register on a bot-friendly server and authenticate themselves to other servers through that.

nightpool commented 5 years ago

I'm tentatively in favor of this although I think we need a more rigorous definition of what a "fediverse participant"/"authenticated user" is. An as2 client? an actor? a Person actor?

One might consider this proposal "kicking the can down the road" as far as archiving or other forms of public access goes—what happens when archival crawlers become as2 Services? It still leaves it up to the server host to handle restricting access by the appropriate actors, without explicit targeting from the user. (Something something Block...?)

Also, right now you have a MUST in a non-normative section, which is needlessly confusing. If that section is truly non-normative, it should probably be reworded to remove the MUST/SHOULD/etc language. If you consider that section normative, which it seems from the wording that you do, you should mark it as such and move it somewhere else.

bhtooefr commented 5 years ago

@JonahAragon There's plenty of ways that ActivityPub privacy controls are suggestions - a bot operator could spin up their own instance or a followbot to capture followers only posts, and there's even some capture of direct messages that could be performed (IIRC there's one mobile client that sends everything it sees to its own search server). Fundamentally, you have to trust whoever you're federating with.

This proposal would at least let well-meaning bot operators know that a post is not intended to be archived publicly, reducing ambiguity.

jonaharagon commented 5 years ago

Fundamentally, you have to trust whoever you're federating with.

This is kind of my point. I think users just need to understand that posts on a social media platform — especially posts marked as public — are indeed public, and should be prepared to deal with the consequences of that.

This proposal would at least let well-meaning bot operators know that a post is not intended to be archived publicly, reducing ambiguity.

I hadn't considered this and I suppose this may be true, but it seems like archive groups are generally undeterred by things like rate limits against their IPs/user agents/etc (which are also clear indicators that their efforts are not exactly welcome in my opinion), and I don't think it would stop them from running their bots through a compliant server to be properly "authenticated".

I can envision scenarios where this flag might be useful so I'd probably be in favor of adding it. I just think it's being proposed in response to a single incident, and I don't think this flag would prevent something like that happening again in the future. My main concern is that I think this proposal will provide a false sense of security/privacy to users because even if well-meaning bot operators complied with the flag in a way we evidently expect them to, there are probably plenty of operators who wouldn't, and nothing about this proposal would technically stop them.

trwnh commented 5 years ago

To be clear, you can currently do the following:

What this proposal does is effectively two things:

Of course there is nothing technically stopping you from republishing an email that you receive to your private inbox, either. But that still relies on you doing the republishing. This change would make AS2 objects less promiscuous and also generally auditable (since you can now log fetches by more than just IP address).

kaniini commented 5 years ago

I'm tentatively in favor of this although I think we need a more rigorous definition of what a "fediverse participant"/"authenticated user" is. An as2 client? an actor? a Person actor?

I think we should define it as an actor, or account associated with an actor. We need the proposal to be reasonably flexible while allowing users to control what shows up to guests on their profiles, etc.

One might consider this proposal "kicking the can down the road" as far as archiving or other forms of public access goes—what happens when archival crawlers become as2 Services? It still leaves it up to the server host to handle restricting access by the appropriate actors, without explicit targeting from the user. (Something something Block...?)

Well, I think Mastodon and Pleroma both agree that Block activities exist in S2S. I think it is a reasonable interpretation that is not well reflected in the official spec, but we can grumble about it another time. However, I think that an instance admin should have mechanisms that allow for proactively blocking individual actors from fetching any objects, and due to the leakage that is inherently caused by multi-tenancy, an instance admin should be able to block entire instances deemed hostile. I'm not sure how to word that in the proposal though. Advice would certainly be welcomed.

Also, right now you have a MUST in a non-normative section, which is needlessly confusing. If that section is truly non-normative, it should probably be reworded to remove the MUST/SHOULD/etc language. If you consider that section normative, which it seems from the wording that you do, you should mark it as such and move it somewhere else.

I solved that by making the rogue instances section non-normative instead of the full security considerations section.

Fundamentally, you have to trust whoever you're federating with.

This is kind of my point. I think users just need to understand that posts on a social media platform — especially posts marked as public — are indeed public, and should be prepared to deal with the consequences of that.

The problem is that ActivityPub effectively only offers a formal binary: public or followers collection. Technically, the protocol can target any arbitrary collection, but in practice, this is how the implementations work. as:Authenticated solves this by introducing a second targetable security label which is backward compatible with current implementations.

This proposal would at least let well-meaning bot operators know that a post is not intended to be archived publicly, reducing ambiguity.

This proposal is useful for bot operators, but ideally is meant to allow for exclusion of object visibility from unauthenticated scrapers. Authenticated scrapes can be handled in a different way on a software level and don't require an extension to ActivityPub.

I hadn't considered this and I suppose this may be true, but it seems like archive groups are generally undeterred by things like rate limits against their IPs/user agents/etc (which are also clear indicators that their efforts are not exactly welcome in my opinion), and I don't think it would stop them from running their bots through a compliant server to be properly "authenticated".

Right now Archive Team does not implement ActivityPub. It is my desire that Archive Team create an archival tool which does speak ActivityPub and appropriately respects as:Authenticated objects by, in their case, ignoring them. This would allow anyone who wants their posts to be archived to opt-in to such archival (and enable such archival in real time), while protecting the average user from posting things publicly. In Pleroma, it is already possible to stream only public objects to an archival instance of this nature.

However, Archive Team uses a web crawler called Warrior. Warrior is just a normal http client, and as:Authenticated does mitigate this situation as the labelled posts would not be included on any generated HTML pages scraped by the crawler.

With that said, you are right that this would not stop all cases. And also, it should be noted that nothing stops somebody from screenshotting posts and archiving that way. But a solution does not need to cover edge cases in order to be useful, and with the right kind of interaction with the archival community, we can probably get them to comply with the wishes of our own community.

I can envision scenarios where this flag might be useful so I'd probably be in favor of adding it. I just think it's being proposed in response to a single incident, and I don't think this flag would prevent something like that happening again in the future. My main concern is that I think this proposal will provide a false sense of security/privacy to users because even if well-meaning bot operators complied with the flag in a way we evidently expect them to, there are probably plenty of operators who wouldn't, and nothing about this proposal would technically stop them.

To be absolutely clear, as:Authenticated is not a flag, but a security label like as:Public. It refers to an implementation-defined collection of "authenticated users" (a more specific definition is being worked out still) that receive an activity. We do not expect hostile implementations to comply with this security label, but as it is a security label that refers to an implementation-defined collection of targets, the admin can choose to exclude hostile implementations they are aware of from that collection. To be absolutely clear though, this is not a flag, but a security label like as:Public. A message would look like this:

{
   "@context": "https://www.w3.org/ns/activitystreams",
   "to": ["https://www.w3.org/ns/activitystreams#Authenticated"],
   "cc": ["https://example.com/users/foobar/followers"],
   "id": "https://example.com/users/foobar/activities/2h3bhbr2bhj42hjb42b214b",
   "type": "Create",
   "object": {
      "id": "https://example.com/users/foobar/notes/hello-world",
      "type": "Note",
      "content": "Hello world!"
   }
}

kaniini commented 5 years ago

To be absolutely clear, the purpose behind as:Authenticated is to make as:Public literally public again.

The intent is to allow users to choose a default of as:Authenticated or their own followers collection as their default target while allowing them to post things that they absolutely want to be public as as:Public.

cwebber commented 5 years ago

Unfortunately, while I'm sympathetic to the problem, I don't think this is a good idea.

kaniini commented 5 years ago

I'm open to alternative solutions to the problem, but this is something that requires a mitigation. Bluntly, the to and cc fields are 100% "advisory policy," but they exist in ActivityPub anyway. What do you propose we do? Again, users need the ability to restrict their audience somehow. If we can't restrict audiences in a cross-implementation compatible way that has low implementation cost (which is precisely what as:Authenticated does), then we're going to continue to have users shooting themselves in the foot.

nightpool commented 5 years ago

the reason I like this better than some theoretical advisory flag X-I-Observe-Sharing-Levels is that it ties scraping/activity to a specific activitypub actor that can be blocked/tracked/etc in the same way we already handle bot accounts. It brings a full class of behaviors into the network in a way that users are used to reasoning about.

However, i'll admit I'm not super sure what the UX for this should be from the Mastodon side of things—probably an account-wide checkbox, but people expect those to be retroactive, and activitypub payloads can't be.

On Mon, May 27, 2019 at 10:04 PM Christopher Lemmer Webber < notifications@github.com> wrote:

Unfortunately, while I'm sympathetic to the problem, I don't think this is a good idea.

  • As said earlier in the proposal, spinning up puppet accounts is too easy for this to be considered anti-abuse; it isn't.
  • This proposal brings us more to "advisory policy" type stuff than we've ever been before. If that's the case, why pretend that authentication is even involved? Just have an http header that's like X-I-Observe-Sharing-Levels: true, then add something like a sharingLevel that intentionally advises what level of sharing is allowed (this can be some enum whose values we can decide in this thread). Just don't serve the object to anyone who doesn't apply it, and it does "as much good" as authenticating would have anyway. This operates not too differently than Signal's "delete after X time" stuff or the (admittedly doomed) "do not track" http headers; on a protocol level, it can't be enforced, but it can ask well meaning participants to... participate well.


kaniini commented 5 years ago

my suggestion on the mastodon side would be to make a new scope and set that as the default for users. it is possible that some users may truly want to post as:Public content once in a while.

trwnh commented 5 years ago

why pretend that authentication is even involved?

Isn't it already possible to require authentication as a server-side software policy? Sure, it's purely "advisory policy" if fetching continues to be possible while unauthenticated. But aren't we in a situation where it's possible to have these access controls while not having a way to advertise these access controls?

Bluntly, the to and cc fields are 100% "advisory policy"

This is probably complicated by the existence of sharedInbox but AFAIK direct delivery via POSTing to an actor's inbox absolutely is not advisory in the slightest. It's no technical guarantee as long as your actor is hosted on someone else's server, of course, but from a purely spec-level view, only the actor should be able to view their own inbox.

At least for the outbox, barring the possibility of forwarding a message, you should be able to advertise what your server will do with GET requests.

kaniini commented 5 years ago

why pretend that authentication is even involved?

Isn't it already possible to require authentication as a server-side software policy? Sure, it's purely "advisory policy" if fetching continues to be possible while unauthenticated. But aren't we in a situation where it's possible to have these access controls while not having a way to advertise these access controls?

Yes, it is possible, and as:Authenticated does not change this as it does not advertise that authenticated fetches are needed (unfortunately).

Bluntly, the to and cc fields are 100% "advisory policy"

This is probably complicated by the existence of sharedInbox but AFAIK direct delivery via POSTing to an actor's inbox absolutely is not advisory in the slightest. It's no technical guarantee as long as your actor is hosted on someone else's server, of course, but from a purely spec-level view, only the actor should be able to view their own inbox.

From a purely spec view, to and cc are supposed to be respected. They can choose not to, but that makes their implementation non-compliant. Same thing with as:Authenticated. So, sharedInbox or not, it's advisory policy.

But more important than that, 2000+ LGBTQ teens just had their personal lives put at risk by the current design of ActivityPub forcing self-leaking. We must respond with a mitigation of some kind for this problem so that people stop self-leaking their posts just to allow them to be boosted.

If we want to refer to security labels as "advisory policy," then we should drop as:Public, as it's equally unworthy.

kaniini commented 5 years ago

to be absolutely clear, I don't think anyone here is under any illusion that this isn't basically "as:Public, but please interpret me in a different way"

but our patient (the fediverse) is presently gushing blood all over the operating table and the heart monitor is going critical.

to be clear, OCAP is the solution we really need, but we need to stop this bleeding before we can even talk about how to make OCAP work. this is strictly about introducing a mitigation that allows users to stop self-leaking their posts to anybody who drops by, just so they can get boostable posts.

cwebber commented 5 years ago

the reason I like this better than some theoretical advisory flag X-I-Observe-Sharing-Levels is that it ties scraping/activity to a specific activitypub actor that can be blocked/tracked/etc in the same way we already handle bot accounts. It brings a full class of behaviors into the network in a way that users are used to reasoning about.

But not one that's safe or effective. It also removes private reading from the system and adds a whole new surveillance tool. People should have the right to read without being observed (writing is a different matter). That's extremely absurd when we're saying "oh no, privacy is being violated (in a public setting), the right way to fix this is to add a giant privacy problem"

to be clear, OCAP is the solution we really need, but we need to stop this bleeding before we can even talk about how to make OCAP work. this is strictly about introducing a mitigation that allows users to stop self-leaking their posts to anybody who drops by, just so they can get boostable posts.

My suggestion of sharingLevel wasn't ocap based, in this case. I agree it's a band-aid, and that ocap is the long term solution, but I'm suggesting an alternate band-aid.

Maybe I wasn't understood. My suggestion would look like the following:

{"@type": "SomethingSomething",
 "content": "really great stuff here",
 "sharingLevel": "DoNotScrape",
 "to": ["https://www.w3.org/ns/activitystreams#Public"]}

Then, any http request that doesn't provide X-I-Observe-Sharing-Levels gets denied.
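A minimal Python sketch of that check might look like the following. The header and property names come from the comment above; the function name `should_serve`, the case-insensitive header matching, and the choice to apply the check only to objects that carry a `sharingLevel` are assumptions made for illustration.

```python
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"
OBSERVE_HEADER = "x-i-observe-sharing-levels"


def should_serve(obj: dict, headers: dict) -> bool:
    """Serve objects with a sharingLevel only to requests that send the header.

    Assumption: objects without a sharingLevel are served as before.
    Nothing here is enforceable; a client can simply send the header.
    """
    if "sharingLevel" not in obj:
        return True
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip().lower() for name, value in headers.items()}
    return normalized.get(OBSERVE_HEADER) == "true"


post = {
    "@type": "SomethingSomething",
    "content": "really great stuff here",
    "sharingLevel": "DoNotScrape",
    "to": [AS_PUBLIC],
}
print(should_serve(post, {}))
print(should_serve(post, {"X-I-Observe-Sharing-Levels": "true"}))
```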

It sucks, but it provides the same "feature": people who aren't in the know about this extension don't even get to see the object, people who are can, and people who want to abuse it... well, they can, but they also could with authentication, and this time we didn't open a giant festering privacy problem.

nightpool commented 5 years ago

nobody is "self-leaking" posts. they're posting publicly. everyone on Twitter and Tumblr posts publicly all the time, and it hasn't even been an issue until today. I don't think your analysis of the outrage is correct. you're implying that a full cohort of users migrated from a platform where private posting isn't even a thing, and didn't expect to actually be posting publicly. that doesn't make any sense. I think it's much more likely that the source of the outrage is what people have said it is—a group of people overstepping unwritten fediverse norms due to their ideology.

I still think this is a reasonably good proposal, and I think it's good to have, but if cwebber isn't happy including it, I don't think there's anywhere near the type of urgency you're implying there is.

On Mon, May 27, 2019, 10:44 PM William Pitcock notifications@github.com wrote:

to be absolutely clear, I don't think anyone here is under any illusion that this is basically "as:Public, but please interpret me in a different way"

but our patient (the fediverse) is presently gushing blood all over the operating table and the heart monitor is going critical.

to be clear, OCAP is the solution we really need, but we need to stop this bleeding before we can even talk about how to make OCAP work. this is strictly about introducing a mitigation that allows users to stop self-leaking their posts to anybody who drops by, just so they can get boostable posts.


nightpool commented 5 years ago

I'm not sure I exactly understand your comment about "private reading". As I see it, this wouldn't add any additional fetches where they weren't before? (the part about Announces feels somewhat trivial and honestly non-essential, since there's no way to Announce something to someone who isn't already an activitypub actor). so the privacy of reading is isomorphic to what it already was—just swapping a single actor for a single IP

cwebber commented 5 years ago

IP tracking isn't great, but previously people could use all sorts of things (including Tor) to read. There also wasn't any requirement previously that fetching be done by a server... it can also be done by clients. This would also break C2S stuff, I think.

But even worse, it adds what's-this-user-reading tracking to the system. Do we really want to add something akin to google analytics to the fediverse? Because I sure don't.

cwebber commented 5 years ago

Especially, do we want to add something that opens up that many problems when there's a not-authenticated option that's an equivalent level of effective+broken, without those problems?

kaniini commented 5 years ago

the problem with using sharingLevel is that it doesn't really fit into how recipients are calculated, which, at least in Pleroma, we calculate a recipient set and then filter using that.

yes, we could just change it in our IR, but it seems cleaner to just have a special label as we do with as:Public.

I would find as:UsersOnly or similar to be an acceptable label name.

but either way, we need the ability for users to reliably create Announceable posts that aren't part of their outbox collection.

nightpool commented 5 years ago

I'm still not sure I understand. how do you get from "only show this object to authenticated users" to "some sort of google analytics tracking" without making additional fetches?

for example, if mastodon implemented this, it probably wouldn't be using any individual user's actor to request the objects, we'd use a system wide Service, since the request would be made on behalf of all users the Announce was delivered to

nightpool commented 5 years ago

unless I'm missing something, there's nothing here saying the origin server has to be doing the authentication, which is maybe the sticking point?

kaniini commented 5 years ago

fwiw I left out discussion of instance-wide actors because I figured that would be worse than leaving it flexible. should I just say "use an instance-specific actor to sign your fetches"?

nightpool commented 5 years ago

but either way, we need the ability for users to reliably create Announceable posts that aren't part of their outbox collection.

for the record, this is not a problem mastodon shares. our only point of ambiguity is "who is allowed to boost a given post?". we've decided that we won't treat boosts of private posts as valid. we could reverse that decision tomorrow (allow all users to boost-by-reference) with minor technical work and it wouldn't change much about the network, just user expectations.

trwnh commented 5 years ago

if you use an instance-specific actor to do an authenticated fetch, then that still says nothing about what the instance will do with the content afterward. the instance could show the post in an unauthenticated public timeline (if it were not aware that it shouldn't)

cwebber commented 5 years ago

the problem with using sharingLevel is that it doesn't really fit into how recipients are calculated, which, at least in Pleroma, we calculate a recipient set and then filter using that.

yes, we could just change it in our IR, but it seems cleaner to just have a special label as we do with as:Public.

This kind of ties in with two of my big regrets about activitypub, which are two ways that we allowed the actor model to be broken:

a) to and cc shouldn't be being used in S2S for filtering delivery; they should be used for informing who to deliver to in C2S. The original model was, you posted to specific inboxes; targeting was very direct. Receiving-side filtering is pretty broken. Probably we should have said that we strip off to and cc before delivery, and instead have another property like audience be the "keep these people in the loop for replies" property. That should be its only purpose. The fact that it's preserved has clearly confused people about the actor model style delivery that the protocol was intended to represent.

b) "but how does that work with sharedInbox?" sharedInbox never should have been changed at last minute to be a server-side filter-on-receive; we should have gone with the original proposal that the ~sharedInbox like thing allow a header with all recipients on the targeted server to be sent to; that way delivery is still intentional.

Regrets, I've had a few..
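To make point (a) concrete, here is a small Python sketch of the delivery model described there: recipients are computed from to and cc on the sending side, and those fields are stripped before the activity goes out. This is not how ActivityPub is specified today. The function name `prepare_for_delivery` is hypothetical, copying the recipients into `audience` is one assumed reading of the "keep these people in the loop" idea, and resolving collections to inboxes is omitted.

```python
import copy


def prepare_for_delivery(activity: dict) -> tuple[list, dict]:
    """Return (recipient ids, outgoing activity with to/cc stripped).

    Assumes 'to' and 'cc' are lists of ids. The caller would POST the
    outgoing activity to each recipient's inbox individually.
    """
    recipients = []
    for field in ("to", "cc"):
        for target in activity.get(field, []):
            if target not in recipients:  # de-duplicate, keep order
                recipients.append(target)

    outgoing = copy.deepcopy(activity)  # leave the stored copy untouched
    outgoing.pop("to", None)
    outgoing.pop("cc", None)
    outgoing["audience"] = list(recipients)  # reply loop only, per the sketch
    return recipients, outgoing


activity = {
    "type": "Create",
    "to": ["https://a.example/users/alice"],
    "cc": ["https://b.example/users/bob", "https://a.example/users/alice"],
}
recipients, outgoing = prepare_for_delivery(activity)
print(recipients)
print(sorted(outgoing))
```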

trwnh commented 5 years ago

stripping off to/cc is one thing, sharedinbox is another thing... i think it's clear how the spec should behave in direct addressing to defined actors or collections. the only real point of contention is what exactly is implied by as#public as it relates to delivery. since as#public is not a real actor, it has to mean something else, and that unfortunately has not been defined at all. what does it mean to have to/cc/audience of as#public? consider mastodon's use of to:public as "public" and cc:public as "unlisted". this is a completely arbitrary distinction that indicates something else is missing semantically.

i.e. there are competing ideas of what addressing/delivery to the public pseudo-collection entails exactly. software should not have any ambiguity in how it handles certain things; to be ambiguous is to be underspecified.
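The to/cc convention mentioned above can be written down in a few lines of Python, which also shows how much meaning is being inferred from field placement alone. The function name `infer_visibility` and the catch-all label "restricted" are placeholders chosen for this sketch, and only the full IRI form of the public collection is checked.

```python
AS_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def as_list(value) -> list:
    """Normalize a missing, single, or list-valued property to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def infer_visibility(obj: dict) -> str:
    """Guess a visibility label from where the public collection appears."""
    if AS_PUBLIC in as_list(obj.get("to")):
        return "public"
    if AS_PUBLIC in as_list(obj.get("cc")):
        return "unlisted"
    return "restricted"  # placeholder for everything else


followers = "https://example.com/users/foobar/followers"
print(infer_visibility({"to": [AS_PUBLIC], "cc": [followers]}))
print(infer_visibility({"to": [followers], "cc": [AS_PUBLIC]}))
print(infer_visibility({"to": [followers]}))
```

Any additional scope would need either another special collection or a separate property, which is the ambiguity the comment points at.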


edit: what happens if we hypothetically want to add more scopes? public is all we have currently; authenticated/logged-in is what's being proposed in this issue; suppose we wanted to have friend-of-a-friend scope as facebook or google+ implemented? there really should be a way to advertise what kind of access is allowed, or at least what is intended by the authoring actor. i don't intend this to be an acl-vs-ocap debate, but rather a concern about the issue of auth{entication,orization} scope itself.

kaniini commented 5 years ago

if you use an instance-specific actor to do an authenticated fetch, then that still says nothing about what the instance will do with the content afterward. the instance could show the post in an unauthenticated public timeline (if it were not aware that it shouldn't)

The instance could do that either way, authenticated fetches just allow for more control over who gets the object versus unauthenticated fetches. Case in point: instance blocking leaks.

kaniini commented 5 years ago

and, yes, this is what i mean with as:Public. we already have something which behaves like a security label. so, let's just admit it's a security label, that it's advisory policy, and move on.

don't like as:Authenticated as a security label? okay, why not as:Users or as:People or whatever. as you admit, the cat's out of the bag.

cwebber commented 5 years ago

The problem isn't the label, it's the requirement for authentication, with no extra serious benefit.

What both proposals have in common: acknowledgement of an "advisory" policy in order to read the message.

What's different is only that one of them requires authentication. So what does authentication get you? Somehow I think the belief is that it will allow for abuse moderation? Except that doesn't make any sense. Let's consider a real-world scenario about it.

So can we drop authentication from this conversation?

npdoty commented 5 years ago

as:Authenticated vs. as:Public seems to speak to the audience/access of the post, but it seems like the archiving case isn't specifically about who has access to the post, but what they do with that access. I agree that it's an advisory label, and I don't think there's anything wrong with that if we're clear about the implications.

Could we have advisory labels that are about the subsequent use of the activity, rather than about the person? As has been noted, an archivist could easily become an authenticated fediverse actor, and they might honestly not know that as:Authenticated was meant to prevent them from reading or storing these posts. Instead, we could have explicit semantics that this isn't meant to be indexed and archived (noindex), which opts the user out of search engines that index or services that archive/backup or store/post in some other context.

On the Web, we have the Robot Exclusion Protocol (robots.txt) for that. I thought here it would have to be ActivityStreams vocabulary, but maybe not. We know that some archivists already choose to ignore robots.txt (and in this notable case all the scraping was done over the web, not through fediverse subscription/delivery, right?), and we know that some actors will choose to ignore any advisory label, but at least we'd be making the intent of the label clear.

cwebber commented 5 years ago

@npdoty I think your phrasing of it as something akin to robots.txt is spot on the nose.

npdoty commented 5 years ago

@nightpool

However, i'll admit I'm not super sure what the UX for this should be from the Mastodon side of things—probably an account-wide checkbox, but people expect those to be retroactive, and activitypub payloads can't be.

Mastodon already has an account-wide checkbox for "Opt-out of search engine indexing", which applies a meta-robots-noindex tag to account/post pages on the Web -- that's already a privacy feature that recognizes that some content will be publicly accessible but still have soft limitations indicated about its indexing/reuse. I'm imagining that we're talking about such a setting also being reflected in the ActivityPub, rather than a new audience/scope. But that wouldn't give the ability (as @kaniini suggests) of a user to case-by-case specify that a post is really-really-public or public-and-fine-to-scrape.

kaniini commented 5 years ago

Yes, the use case is semi-public posts. What I encountered today was that a lot of users on berries were posting publicly because they did not have the option of making a post that was shareable amongst their friends.

Let me be clear. I hate advisory policy. I absolutely hate it.

But we need a solution to allow people to share boostable posts amongst their friends without sharing them with the public at large. I don't care how we accomplish this per se, as we can just translate it to behave how we need it to behave in Pleroma. If that is something like as:NotReallyPublic or as:NotLoggedInIntent or whatever, I don't care. Let's just pick a label already.

cjslep commented 5 years ago

I haven't had a chance to catch up on this lengthy thread yet, and I have a couple thoughts. I apologize if these points have already been made:

My preference is to take this as an opportunity to create a visibility property, which can be a structured and richer way of expressing the creator's intent (such as noarchive, noannounce, onlyauthenticated, etc). More advanced implementations can support this richer description as overriding as:Public whereas the old implementations would ignore visibility and just use that.

This would also allow users to do an Update to affect visibility without having to re-do the entire delivery logic. go-fed is already a mess just trying to handle as:Public during delivery, and I don't know how many implementations will cope if as:Public or as:Authenticated keeps getting added to, removed from, or moved between to and cc, with all the complex side effects that entails. Much easier to leave as:Public as one legacy dimension, and build out a property that is more expressive.

EDIT: Thanks again and sorry if I am rehashing already-made points.

wiktor-k commented 5 years ago

I'm imagining that we're talking about such a setting also being reflected in the ActivityPub, rather than a new audience/scope.

Just for the record it is possible to use X-Robots-Tag HTTP response header when fetching ActivityPub objects in JSON representation. This of course is not embedded in the object.

jonaharagon commented 5 years ago

On the Web, we have the Robots Exclusion Protocol (robots.txt) for that.

Again, however, this is something that can and will be completely ignored. Like I said, none of the “solutions” mentioned in this thread will do anything about the issue it is supposedly meant to prevent. There may be valid use cases for as:Authenticated (or whatever; the actual name doesn’t matter in this discussion), but pretending that it will help with moderation doesn’t make sense, because the fediverse is inherently open and allows any server to connect by default, which is a good thing.

Personally, this seems like a user issue to me. I think people just need to understand that once something is posted publicly or even semi-publicly anywhere, there’s no taking it back.

npdoty commented 5 years ago

Would X-Robots-Tag: noarchive, noindex in responses for ActivityPub posts be sufficient? I think it's at least a good start, @wiktor-k and for Mastodon it would supplement the existing <meta> tag on the Web version of a toot.

But I think @kaniini was also looking for something in the object itself so that a server that receives such a post could perform some subsequent logic about limiting its display or distribution, like not rendering a Web version of it to a user that's not authenticated. (Servers could potentially keep track of whether there was an X-Robots-Tag response header when they requested the full post, but I'm not sure that would be attached on every delivery, and it complicates the storage requirements on AP servers.)
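To make the header idea concrete, here is a minimal sketch of how a receiving server might read X-Robots-Tag from a fetch response and keep the directives next to the stored object. The function names (`robots_directives`, `may_archive`, `may_index`) are hypothetical, and the parsing assumes the simple comma-separated form shown above; it does not handle user-agent-prefixed values.

```python
def robots_directives(headers):
    """Return the lowercase directives found in an X-Robots-Tag header.

    `headers` is assumed to be a plain dict of response headers.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-robots-tag", "")
    # Split the comma-separated list and drop empty fragments.
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def may_archive(headers):
    """True unless the response asked not to be archived."""
    return "noarchive" not in robots_directives(headers)


def may_index(headers):
    """True unless the response asked not to be indexed."""
    return "noindex" not in robots_directives(headers)


response_headers = {"X-Robots-Tag": "noarchive, noindex"}
print(sorted(robots_directives(response_headers)))
```

This is still advisory: the sketch only records what the origin asked for, and a server would have to store the result itself, since the header is not part of the object.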

npdoty commented 5 years ago

@JonahAragon

On the Web, we have the Robots Exclusion Protocol (robots.txt) for that.

Again, however, this is something that can and will be completely ignored. Like I said, none of the “solutions” mentioned in this thread will do anything about the issue it is supposedly meant to prevent.

It can and will be ignored, but that doesn't mean it always is ignored. As far as I know, the Mastodon checkbox for opting out of search engine indexing is effective in that Google (and the Internet Archive, and several others) doesn't index, archive and provide cached versions of every one of your public posts, which it otherwise probably would. It will also take some work to convince other archivers not to ignore it for AP servers, but I think there may be some openness to respecting community and user preferences.

jonaharagon commented 5 years ago

It can and will be ignored, but that doesn't mean it always is ignored.

Of course, but my overall point is that if this is being added to protect the LGBTQ teen community or similar on Mastodon, a setting like this is going to instill a false sense of privacy when it is really adding a technical barrier that is incredibly trivial to overcome. IMHO this is more dangerous, as I can see it encouraging users to post more information they wouldn't want shared publicly.

cjslep commented 5 years ago

I wound up talking about this proposal more in the socialcg IRC (https://chat.indieweb.org/social/2019-05-28).

I like the general idea @kaniini is pushing for, but not this particular implementation.

I strongly prefer a property over a tag. Like a tag, it does nothing to fundamentally change the behavior of bad actors. Like a tag, it is merely an assertion of user intent, which I think is important because it will help identify bad actors and therefore allow blacklisting. A totally separate solution is required to have the tech prevent the bad actors in the first place, and as kaniini has stated multiple times in different places, this would most likely be a breaking change to ActivityPub. But that is outside what I intend to discuss here.

I hate the tag idea in 'cc' and 'to' properties because it is ripe for semantic abuse:

cjslep[m] I don't view anything in AP as a UX problem, that's up to each app when interpreting AP. I view 'as:Public' as a Boolean signal that is abused (today, by Mastodon) to be a tri state signal -- and that third state is a semantic meaning problem. Mastodon has already dictated what 'to' and 'cc' mean for its tri-states, but each other app could choose something completely different. And then from there a UX problem will arise: Masto shows one way, another app shows a different way. It is pure abuse of an unexpressive, Boolean state signal. Adding another Boolean signal that could be abused, doesn't sound good. In fact, it adds a new dimension of abuse, so you are getting 3^N new semantic meaning problems for N boolean-like signals

cjslep[m] Instead of seeing a magic string in an existing property and having to 1: recognize as a developer, that this magic string in this property 'to' vs 'cc' has special significance, and 2: as an app developer, buy into how everyone else is (ab)using this signal to mean the same thing. ...just have a property 'visibility' whose value can be one from a standard supported list of things, as well as custom add ons just for that developer's particular app. No semantic meaning problem, so any UX problems then become tractable

Yes, these binary flags in delivery are (ab)used to have ulterior meaning for Mastodon today:

cjslep[m] If 'as:Public' is in the 'to' property it means "this is truly public and visible on all public timelines" and if it is on the 'cc' it means "this is public but not on the public timelines". These semantic meanings mean nothing to other apps that don't have the same timeline concept in them

This is why I am for an extensible, expressive property that can be added onto objects to signal both to good machines and good humans how the user intends to exercise their right to privacy. I don't want to talk about the bad actors because, again, there are more-fundamental technological questions. Also I don't want to talk about a user's backup option in legal recourse, for various reasons.

I like the property idea because it is also on the object itself, which means it'll be carried with the data and be a part of any integrity checksums and/or live right beside OCAP options. It follows all these good patterns.

This is why I strongly do not like the HTTP header idea. In addition to violating a lot of the good benefits outlined above (not on object, etc), ActivityPub doesn't necessarily always have to be tied to HTTP. There's been an idea of AP over scuttlebutt, for example. So I prefer keeping it as un-tied to HTTP as possible, to ease future growing pains.

jaywink commented 5 years ago

Completely agree with everything @cjslep said above. A visibility flag is IMHO the way to go to signify the level of visibility that the author intended. The tricky part is how to define the visibilities. People can't even agree what "public" means. I consider it meaning "open to anyone even bots" but other people get angry when they find out people who are not authenticated can read it. Just defining a bunch of visibility levels won't make the problem disappear. You still need to trust who you send content to to 1) interpret those levels the same way you do and 2) to respect them. Neither of them can be guaranteed.

Still, even though I initially liked this proposal, thinking about it again I'd -1 it on the basis that it's making the problem we have with as:Public worse. The fact that as:Public is used for audience targeting is a mistake in itself. You can't deliver to "everyone" but you can indicate with visibility your intention. Ideally a public object would have no to/cc at all - you just send it onwards and they do with it what they want.

wiktor-k commented 5 years ago

The fact that as:Public is used for audience targeting is a mistake in itself. You can't deliver to "everyone" but you can indicate with visibility your intention.

Ha, now that you mention it I had the same feeling when writing my own ActivityPub client/server software. Inferring visibility from the position of public in to/cc was, and still is for me, more complex than it should be.
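For anyone following along, this is roughly what that inference looks like. It is a sketch of the Mastodon-style convention described earlier in the thread (as:Public in to versus cc), not any implementation's actual code. The labels "public", "unlisted", and "restricted" and the function names are made up for illustration; the three spellings of the public collection are the ones the ActivityPub spec says may appear.

```python
# The public collection may appear as a full IRI, a compact IRI, or a bare term.
PUBLIC_ALIASES = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}


def _as_list(value):
    """Normalize a to/cc value, which may be absent, a single item, or a list."""
    if value is None:
        return []
    if isinstance(value, (str, dict)):
        return [value]
    return list(value)


def _addresses_public(value):
    """True if any string entry in the addressing field names the public collection."""
    return any(
        isinstance(item, str) and item in PUBLIC_ALIASES
        for item in _as_list(value)
    )


def infer_visibility(obj):
    """Guess the intended visibility from where as:Public appears (hypothetical labels)."""
    if _addresses_public(obj.get("to")):
        return "public"      # shown on public timelines
    if _addresses_public(obj.get("cc")):
        return "unlisted"    # public, but kept off public timelines
    return "restricted"      # no public addressing at all


note = {
    "to": ["https://example.social/users/alice/followers"],
    "cc": ["https://www.w3.org/ns/activitystreams#Public"],
}
print(infer_visibility(note))
```

Every app that wants these semantics has to reimplement this kind of inference and agree on what the positions mean, which is the complexity being complained about here.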

kaniini commented 5 years ago

how about something like this:

{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {"DisallowGuests": "as:DisallowGuests",
     "Pinned": "as:Pinned"}
  ],
  "visibilityHint": ["DisallowGuests", "Pinned", ...]
}

this makes it clear that it's a visibility hint (which MAY be ignored in platforms where the hint is not applicable). that allows us to start working on the UX for this stuff while the details for OCAP get worked out. OCAP implementations would eventually enforce the visibilityHint by requiring capabilities to be acquired.
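As a rough illustration of how a server might honor the hint when rendering, here is a minimal sketch. It assumes visibilityHint arrives as a string or a list of strings, treats "DisallowGuests" and its compact form "as:DisallowGuests" as equivalent, and ignores any hint it does not recognize. The function names are hypothetical and none of this is specified anywhere yet.

```python
# Spellings of the guest-exclusion hint accepted by this sketch (an assumption).
DISALLOW_GUESTS = {"DisallowGuests", "as:DisallowGuests"}


def visibility_hints(obj):
    """Return the object's visibilityHint values as a set of strings."""
    raw = obj.get("visibilityHint", [])
    if isinstance(raw, str):
        raw = [raw]
    # Non-string entries are skipped rather than rejected, since hints are advisory.
    return {hint for hint in raw if isinstance(hint, str)}


def may_show(obj, viewer_is_authenticated):
    """Decide whether to render the object for this viewer.

    Unknown hints never block display; only the guest-exclusion hint is acted on.
    """
    if visibility_hints(obj) & DISALLOW_GUESTS:
        return viewer_is_authenticated
    return True


post = {"type": "Note", "visibilityHint": ["DisallowGuests", "Pinned"]}
print(may_show(post, viewer_is_authenticated=False))
print(may_show(post, viewer_is_authenticated=True))
```

A platform where the hint is not applicable can simply skip the check, which matches the MAY-be-ignored framing above.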

jaywink commented 5 years ago

Sounds good. So instead of defining a list of visibilityHint values, anyone can define what they want and eventually implementations will start using a defined list out of convention.

ealgase commented 5 years ago

I may be totally misreading this, but doesn't this kill c2s clients, by requiring authentication from a server for fetching the objects?

trwnh commented 5 years ago

no, it requires authentication from an actor

ap-socialhub commented 1 year ago

This issue has been mentioned on SocialHub. There might be relevant details there:

https://socialhub.activitypub.rocks/t/fep-c118-content-licensing-support/2903/14

akuckartz commented 1 year ago

Another proposal was made by @kidehen@mastodon.social:

What about access controls whereby a user grants crawling privileges to a user, user-agent, service, or whatever grouping they seek to target?

I have crawling ability via a #REST #API or #SPARQL query that I protect using attribute-based access controls #ABAC right now. My default rule allows any authenticated identity (using a cocktail of authentication protocols) to perform said task.

This is also compatible with existing #Web opt-out. https://mastodon.social/@kidehen/109779431773812206

evanp commented 2 months ago

@trwnh has started a FEP on the topic: https://codeberg.org/fediverse/fep/src/branch/main/fep/7502/fep-7502.md I believe that's the best way to proceed. I'm going to mark this ticket Needs FEP, pending closure.