w3c / activitypub

http://w3c.github.io/activitypub/

sharedInbox / siteInbox type endpoint (publicInbox, but not just for public posts) #242

Closed cwebber closed 7 years ago

cwebber commented 7 years ago

Currently, the most general way to post an activity in ActivityPub is to post it to a user's inbox endpoint. However, since a post by a well-known figure with many subscribers would result in deliveries to many users at once, we've made an exception for public posts, which may be posted to the publicInbox, an endpoint that may be shared amongst users on a site.

On the call today, we found that this was not enough for Mastodon. On Mastodon, followers-only posts are common. Gargron gave the example that they have over 12k users: should every followers-only post result in 12k HTTP requests, given that many users are on shared servers?

Gargron suggested that Mastodon will probably reuse the publicInbox endpoint for this purpose. While I personally strongly prefer the delivery-to-inboxes approach, I think we need to address this. It's clear that Mastodon will do something to that effect in its implementation, so I think we need to get this right in ActivityPub itself; otherwise we could end up in the same space as what's happening in OStatus right now. One could easily see an implementation like Mastodon posting private content to the publicInbox endpoint and expecting servers to filter delivery based on content, while other servers, unaware of this, unintentionally deliver that information publicly to their users. That would be bad!

So, I think we should rename publicInbox to something like sharedInbox or siteInbox and change its behavior.
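
To make the request-count concern concrete, here is a minimal sketch of how a sending server might collapse deliveries when several followers sit behind one shared endpoint. The `Follower` record, its field names, and the URLs are hypothetical and only for illustration; nothing here is taken from the spec text or from Mastodon's code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Follower:
    # Hypothetical record; the field names are assumptions for illustration.
    inbox: str
    shared_inbox: Optional[str] = None


def delivery_targets(followers: List[Follower]) -> List[str]:
    """Return the distinct URLs to POST to, preferring a shared inbox when one exists."""
    targets = []
    seen = set()
    for follower in followers:
        url = follower.shared_inbox or follower.inbox
        if url not in seen:
            seen.add(url)
            targets.append(url)
    return targets


followers = [
    Follower("https://example.com/alice/inbox", "https://example.com/inbox"),
    Follower("https://example.com/chuck/inbox", "https://example.com/inbox"),
    Follower("https://foobar.net/bob/inbox"),  # no shared endpoint advertised
]
targets = delivery_targets(followers)
print(len(followers), "followers ->", len(targets), "requests")
```

With this grouping, the number of HTTP requests scales with the number of distinct endpoints rather than with the number of followers.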

cwebber commented 7 years ago

If this is only used for delivery to followers, this endpoint can be simplified considerably. All other recipients could be handled by posting to individual inboxes.

puckipedia commented 7 years ago

I would vote for specifying each and every recipient, as servers may implement the addressing slightly differently, and a server can't feasibly look through a collection containing e.g. thousands of objects stored remotely. Alternatively, in a perfect world, those 12 thousand followers would be spread over e.g. a thousand servers, which would limit the number of recipients per server to far fewer than the total of twelve thousand.

And also, what if a user is mentioned in to and also in the follower collection?

clacke commented 7 years ago

Would it make sense to have this as two different features, publicInbox and sharedPrivateInbox? I'm thinking that maybe some implementation would only be bothered implementing publicInbox. But maybe that's adding complexity for little benefit. Personally I have to admit I was wondering why the shared endpoint was for public posts only.

puckipedia commented 7 years ago

I think if ActivityPub had an endpoint that sends objects to many different ~~clients~~ remote actors, a publicInbox endpoint would be unneeded, I guess? It would duplicate the functionality provided by the former.

strugee commented 7 years ago

@puckipedia I can't quite parse that sentence? What now about clients?

Alternately, a sending server could, as a separate part of the message, specify an exact list of recipients relevant to that server. For 12k or even 1M followers, this could be a large post (though it could be done in multiple posts) but it would be less large than 12k individual HTTP POSTs. But it would be much more precise, and more respectful of things like blocklists (which currently we specify not federating across servers to protect users.)

So it appears to me that, ignoring efficiency, this is the better option. It's guaranteed to be precise (even if followers lists get out of sync), and it respects Blocks. So I think the million-dollar question here is, is this good enough? This is just an optimization, after all... if it's good enough I say ship the simplest thing.

Also, since I'm having a hard time keeping this abstract scenario in my head, lemme make sure my notion of the Block problem matches everyone else's:

  1. alice@example.com and chuck@example.com follow bob@foobar.net
  2. bob@foobar.net Blocks chuck@example.com; foobar.net's notion of bob@'s followers is updated but example.com's notion of bob@'s followers isn't because the Block isn't federated
  3. bob@foobar.net posts a note with to: Followers
  4. foobar.net distributes the note to alice@example.com but not chuck@example.com

The problem being that if example.com is responsible for inferring where the note should've been delivered, it'll get it wrong because it doesn't know about the Block (and the subsequent mutation of bob@foobar.net's followers list). Right?
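
The scenario above can be sketched as a toy simulation. The in-memory sets and the `local_recipients` helper are hypothetical stand-ins for each server's stored state, not anything defined by ActivityPub.

```python
# Each server's view of bob@foobar.net's followers (hypothetical in-memory state).
foobar_view = {"alice@example.com", "chuck@example.com"}
example_view = set(foobar_view)  # example.com's copy, in sync after step 1

# Step 2: bob Blocks chuck. Only foobar.net updates its view,
# because the Block is not federated.
foobar_view.discard("chuck@example.com")


def local_recipients(view, domain):
    """Followers in `view` that live on `domain`, sorted for stable output."""
    return sorted(actor for actor in view if actor.endswith("@" + domain))


# Explicit delivery: foobar.net names the recipients itself.
explicit = local_recipients(foobar_view, "example.com")
# Implicit delivery: example.com infers recipients from its own stale copy.
inferred = local_recipients(example_view, "example.com")

leaked = sorted(set(inferred) - set(explicit))
print("explicit:", explicit)
print("inferred:", inferred)
print("delivered despite the Block:", leaked)
```

Under these assumptions, the receiving server still hands the note to chuck@example.com when it infers the audience itself.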

puckipedia commented 7 years ago

woops, with clients I meant 'remote actors' (updated above comment to clarify). And indeed, that's the same notion of the problem I had at least.

Gargron commented 7 years ago

I am strongly in favour of a sharedInbox endpoint that would respect audience targeting (such as "followers collection"). Listing individual recipients does not scale well. My decision is also largely informed by business logic implemented in Mastodon. The main way of getting a status on someone's home feed is them following the author. That means the huge list of individual targets would be pretty useless. Blocks federate in Mastodon, so that's not the issue.

I am also in favour of this because I do not like the idea that publicInbox should ignore to/cc fields. The logic should be the same regardless of which endpoint is being delivered to, so renaming it to sharedInbox would be more semantic.

Speaking of which, it'd be nice to be able to target a Block activity to such a sharedInbox, without targeting the blocked user. That would feel more right than having to send a Block activity directly to the blocked user.

cwebber commented 7 years ago

> I am strongly in favour of a sharedInbox endpoint that would respect audience targeting (such as "followers collection"). Listing individual recipients does not scale well. My decision is also largely informed by business logic implemented in Mastodon. The main way of getting a status on someone's home feed is them following the author. That means the huge list of individual targets would be pretty useless. Blocks federate in Mastodon, so that's not the issue.

This is definitely a tricky point here. We're definitely seeing decisions of the backend bleed into desires of the protocol. I guess maybe that's inevitable, but I think this is the first time we've seen it so clearly. We have effectively two approaches here:

I'm not trying to make a case for either in this post, I'm just trying to document the difference. Unfortunately, it's also pushing some pressure to make a decision in how we implement this, and whatever we do will probably affect the backends of these systems.

cwebber commented 7 years ago

Going back to this suggestion I made earlier:

> If this is only used for delivery to followers, this endpoint can be simplified considerably. All other recipients could be handled by posting to individual inboxes.

I wonder if this could avoid the explicit vs implicit battle? Use this endpoint for followers only, and use explicit delivery for everything else? What do people think about that?

Gargron commented 7 years ago

email-style vs Twitter style

Another point I'd like to add is the distinction between inbox content which is otherwise not present in ActivityPub. In e-mail, your inbox is stuff people send you personally. In a social network, you have a home feed, which is things you subscribe to passively, and notifications, which is stuff sent to you personally. I don't think any of our current users would be happy about the prospect of anyone having the capacity to insert their post into their home feed.

If you use explicit delivery, there is no way to distinguish between a truly targeted post (notification-worthy) and a passive post to followers (home). So followers URI must be handled imo.
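
A minimal sketch of the distinction, assuming the followers URI is kept in the addressing and the receiving server routes on it. The `classify` function, the "home" and "notification" labels, and all URLs are hypothetical names for illustration.

```python
def as_list(value):
    """Normalize an addressing field that may be absent, a string, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def classify(activity, recipient_id):
    """Return "notification" if the recipient is named directly, else "home"."""
    direct = as_list(activity.get("to")) + as_list(activity.get("cc"))
    if recipient_id in direct:
        return "notification"
    # Reached only through a collection such as the author's followers.
    return "home"


alice = "https://example.com/users/alice"
followers_post = {
    "type": "Create",
    "actor": "https://foobar.net/users/bob",
    "to": ["https://foobar.net/users/bob/followers"],
}
mention = {
    "type": "Create",
    "actor": "https://foobar.net/users/bob",
    "to": [alice],
    "cc": ["https://foobar.net/users/bob/followers"],
}
print(classify(followers_post, alice))
print(classify(mention, alice))
```

If the sender instead expanded the followers collection into individual actor IDs, both activities would name alice directly and the two cases would look the same to this check.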

cwebber commented 7 years ago

Even if we do adopt the implicit federated posts endpoint as a compromise, it does introduce the problem that it requires federating Block activities if Block activities are also meant to stop delivery to such a user as a follower, which we even have text in the spec as-is saying you shouldn't do, to protect users...

jaywink commented 7 years ago

Lots of good IRC discussion about this. AFAICT this is the last suggestion, and it mostly received head nodding:

23:35 | <cwebber2> | - we should switch publicInbox to sharedInbox; make it for addressing only to followers and individuals on to/cc
23:35 | <cwebber2> | *and*
23:35 | <cwebber2> | - we switch the Block section away from saying SHOULD NOT federate, and instead include an informative note explaining the tradeoffs to doing each

Relevant IRC logs at this point and before: https://chat.indieweb.org/social/2017-07-16#t1500237332450000

strugee commented 7 years ago

Link to the Etherpad from the Mumble call: https://public.etherpad-mozilla.org/p/activitypub-implicit-explit

Someone correct me if I'm wrong but I believe this was the consensus:

  1. sharedInbox is used only for public and followers delivery, everything else goes to individual inboxes
  2. ForceUnfollow (what it sounds like; probably done with {Undo: {Accept {Follow}}} or {Reject {Follow}}) is a separate concept from Ignore (preventing side effects, probably still done with {Block: {Actor}})
  3. Whether to perform ForceUnfollow at the same time as Ignore is left up to implementors, who can then use the "strawman proposal" algorithm at the bottom of the Etherpad. E.g. Mastodon will do this to match existing UI and thus will always use the sharedInbox endpoint.
  4. Need some way to communicate to clients whether or not the implementation does this, so clients can present accurate information to user (i.e. want to avoid a situation where client presents ForceUnfollow and Ignore as separate actions, but the server implicitly ForceUnfollows when the user Ignores). Could be a binary flag on the actor
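
Point 1 could be sketched roughly as follows. The `split_audience` helper and the example URLs are hypothetical; only the Public collection IRI comes from ActivityStreams.

```python
# The special Public collection defined by ActivityStreams.
PUBLIC = "https://www.w3.org/ns/activitystreams#Public"


def split_audience(activity, followers_collection):
    """Split addressing into targets eligible for sharedInbox and
    targets that need delivery to an individual inbox."""
    audience = []
    for field in ("to", "cc"):
        value = activity.get(field, [])
        audience.extend(value if isinstance(value, list) else [value])

    collections = (PUBLIC, followers_collection)
    shared = [target for target in audience if target in collections]
    individual = [target for target in audience if target not in collections]
    return shared, individual


followers = "https://foobar.net/users/bob/followers"
note = {
    "type": "Create",
    "actor": "https://foobar.net/users/bob",
    "to": [followers],
    "cc": ["https://example.org/users/carol"],
}
shared, individual = split_audience(note, followers)
print("via sharedInbox:", shared)
print("via individual inboxes:", individual)
```

Here carol is not assumed to be a follower, so she gets her own delivery while the followers audience goes through the shared endpoint.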

clacke commented 7 years ago

> If you use explicit delivery, there is no way to distinguish between a truly targeted post (notification-worthy) and a passive post to followers (home).

Oh! I wasn't paying attention and didn't notice that major and minor inboxes went away on the way to standardization.

But coupling major/minor addressing to delivery efficiency seems backwards.

Gargron commented 7 years ago

@clacke By major and minor addressing do you mean "to" vs "cc"? Because if so, that is still in the spec.

cwebber commented 7 years ago

@Gargron major and minor are "specialized" filtered read-only inboxes in pump.io. They're basically filters for your main timeline, filtered to have just the stuff you really want in it, vs the firehose of everything, including every follow and unfollow and like and delete that crosses your timeline.

@clacke major and minor probably won't go away, they'll just be moved to an extension; they're not necessary for the main protocol and are super underspecified as-is. Probably we'll see some different filtered inbox proposals come up in extension-land.

strugee commented 7 years ago

https://github.com/pump-io/pump.io/blob/master/API.md#major-and-minor-feeds for those who want as close as you can get to authoritative info on these feeds.

clacke commented 7 years ago

If to and cc are still in the spec I don't see that "there is no way to distinguish between a truly targeted post [ . . . ] and a passive post to followers". Delivery mechanism shouldn't affect that.

cwebber commented 7 years ago

From the meeting:

<eprodrom> PROPOSED: for https://github.com/w3c/activitypub/issues/242, group supports
  renaming publicInbox to sharedInbox and allowing sending to followers only IFF
  implementation support 

... so this is a TODO for me.

cwebber commented 7 years ago

sharedInbox is now added to the editor's draft. Additionally, the "IFF implementation support" is already in place, since Mastodon has added sharedInbox, as described in this document, to their implementation, and is ready to roll it out in their next release.
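
For reference, a small sketch of how a sender might look the endpoint up, assuming the layout in the published ActivityPub spec, where sharedInbox is listed under the actor's `endpoints` property. The actor document and the `preferred_inbox` helper are made up for illustration.

```python
import json

# Hypothetical actor document; only the property names follow the spec.
actor_json = """
{
  "type": "Person",
  "id": "https://example.com/users/alice",
  "inbox": "https://example.com/users/alice/inbox",
  "endpoints": {
    "sharedInbox": "https://example.com/inbox"
  }
}
"""


def preferred_inbox(actor):
    """Use the shared inbox when the actor advertises one, else the personal inbox."""
    endpoints = actor.get("endpoints")
    # `endpoints` may also be a link to a separate document; this sketch
    # does not fetch it and simply falls back to the personal inbox.
    if isinstance(endpoints, dict) and endpoints.get("sharedInbox"):
        return endpoints["sharedInbox"]
    return actor["inbox"]


actor = json.loads(actor_json)
print(preferred_inbox(actor))
```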

cwebber commented 7 years ago

I'm not closing this yet because we need help to add sharedInbox to the ActivityStreams vocabulary. I guess that means that publicInbox should be marked as deprecated as well on that document?

cwebber commented 7 years ago

Oh yeah, sharedInbox was added to the AS2 vocab/context, so we're good!