w3c / activitypub

http://w3c.github.io/activitypub/

Standardize outbox seeking method (way to retrieve posts published between other posts) #378

Open saschanaz opened 1 year ago

saschanaz commented 1 year ago

From https://github.com/mastodon/mastodon/issues/34#issuecomment-1616267547:

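Per my understanding, Mastodon supports outbox paging (which is required to fill the gap described in https://github.com/mastodon/mastodon/issues/34#issuecomment-392873406), but AFAICT this is not standardized and implementations use their own parameter names for it. Mastodon uses min_id/max_id (https://github.com/mastodon/mastodon/blob/4fe2d7cb59f4622ff8af2f048b883f413e87c68e/app/controllers/activitypub/outboxes_controller.rb#L55-L61) while Misskey uses since_id/until_id (https://github.com/misskey-dev/misskey/blob/a1327fa9e1329f2fb00d70b1e2332cea015bfdee/packages/backend/src/server/ActivityPubServerService.ts#L325). Some implementations may not support such parameters at all; Lemmy seemingly supports neither. In other words, we don't have the infrastructure for this and we need to solve that first.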
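To make the incompatibility concrete, here is a minimal sketch of what a consumer has to do today: keep a per-implementation table of parameter names. The parameter names are the ones reported above; the helper `outbox_range_url`, the `software` labels, and the example URLs are illustrative assumptions, and the exact inclusive/exclusive semantics of each parameter may differ between implementations.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names as reported in this thread; none of them is standardized.
# Tuple order: (parameter for the lower bound, parameter for the upper bound).
RANGE_PARAMS = {
    "mastodon": ("min_id", "max_id"),
    "misskey": ("since_id", "until_id"),
}


def outbox_range_url(outbox_url, software, lower_id=None, upper_id=None):
    """Return outbox_url with implementation-specific range parameters appended.

    Hypothetical helper: raises ValueError for software with no known parameters.
    """
    if software not in RANGE_PARAMS:
        raise ValueError(f"no known range parameters for {software!r}")
    lower_name, upper_name = RANGE_PARAMS[software]
    parts = urlsplit(outbox_url)
    query = parse_qsl(parts.query)  # keep any existing query parameters
    if lower_id is not None:
        query.append((lower_name, str(lower_id)))
    if upper_id is not None:
        query.append((upper_name, str(upper_id)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A server with no entry in the table (Lemmy, per the above) cannot be queried this way at all, which is the gap this issue is about.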

nightpool commented 1 year ago

I'm not sure I understand the issue here. Are the existing outbox paging mechanisms (next, prev, first, last, etc) insufficient? Like many specs, ActivityPub doesn't mandate how URLs have to be constructed, but it instead specifies how to find the next page, the previous page, etc to form a linked list.

saschanaz commented 1 year ago

How would you fill the gap with next/prev/first/last? Calling next again and again until you retrieve the target range is not ideal. (And it can easily trigger an infinite loop if the remote server generates infinite pages.)
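For reference, a minimal sketch of the linear walk that first/next implies, with a page cap and a cycle check as guards against endless pagination. `fetch` is a hypothetical callable that takes a URL and returns the decoded JSON document; `max_pages` is an arbitrary illustrative limit, not something the spec defines.

```python
def iter_outbox_items(outbox_url, fetch, max_pages=50):
    """Yield outbox items by following first/next links.

    Stops after max_pages pages or when a page id repeats, so a remote
    server that generates pages forever cannot keep the walk running.
    """
    outbox = fetch(outbox_url)
    page_ref = outbox.get("first")
    seen_page_ids = set()
    pages_read = 0
    while page_ref is not None and pages_read < max_pages:
        # first/next may be a URL string or an embedded page object.
        page = fetch(page_ref) if isinstance(page_ref, str) else page_ref
        page_id = page.get("id")
        if page_id is not None:
            if page_id in seen_page_ids:
                break  # the linked list loops back on itself
            seen_page_ids.add(page_id)
        yield from page.get("orderedItems", page.get("items", []))
        page_ref = page.get("next")
        pages_read += 1
```

The cost is proportional to the number of pages between the head of the outbox and the target range, which is why a seek parameter is attractive.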

nightpool commented 1 year ago

As Claire says, it's fundamentally impossible to know that you've received "every" toot a user posts. You'd have to have some property on the Activity level that says which posts came prior and then keep track of that every time you received an activity to tell if you're missing any posts. And that doesn't work at all for private / followers only / circles / group posts.
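A rough sketch of the bookkeeping described here, assuming a hypothetical `previous` property on each activity that holds the id of the activity published just before it. No such property exists in ActivityPub today; the name is made up for illustration.

```python
def find_missing_predecessors(activities, prev_key="previous"):
    """Return ids that are referenced as predecessors but were never received.

    `prev_key` names a hypothetical property; it is not part of ActivityPub.
    """
    received = {activity["id"] for activity in activities}
    missing = []
    for activity in activities:
        prev_id = activity.get(prev_key)
        if prev_id is not None and prev_id not in received and prev_id not in missing:
            missing.append(prev_id)
    return missing
```

As noted above, this only works if the receiver is entitled to see every activity in the chain, which is not the case for private, followers-only, circle, or group posts.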

saschanaz commented 1 year ago

I don't think that works at all for anything, since any post can be deleted at any time. And it's not fundamentally impossible: the parameters are already implemented in Mastodon and Misskey, just in incompatible ways. We need to standardize them somehow.

nightpool commented 1 year ago

I don't understand how or when servers would know to request those IDs, if there's no way to know a post is missing. Additionally, it would require all servers to use and expose lexically sortable IDs, which would be a huge breaking change for servers using e.g. slugs, UUIDs, etc

saschanaz commented 1 year ago

> I don't understand how or when servers would know to request those IDs, if there's no way to know a post is missing.

Clients, e.g. for Twitter, had their ways to detect gaps; theoretically servers can have that too.

> Additionally, it would require all servers to use and expose lexically sortable IDs, which would be a huge breaking change for servers using e.g. slugs, UUIDs, etc

Using time for that could be more compatible.
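As a minimal sketch of the time-based idea: selecting a range by the `published` timestamp works regardless of what the IDs look like. The helper names and the exclusive bounds are illustrative assumptions, and the sketch assumes every timestamp carries a UTC offset or a trailing `Z`.

```python
from datetime import datetime


def parse_published(value):
    """Parse an ISO 8601 timestamp; older Pythons reject a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def items_between(items, after, before):
    """Return items whose `published` lies strictly between after and before."""
    lower, upper = parse_published(after), parse_published(before)
    return [
        item
        for item in items
        if "published" in item and lower < parse_published(item["published"]) < upper
    ]
```

Here the filtering runs over items already in hand; a standardized seek parameter would let the remote server apply the same comparison before paging.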

saschanaz commented 12 months ago

Random idea: add outboxAfter/outboxBefore properties on objects? No need to care about IDs in that case as servers can send whatever URL that fits their implementation.
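A rough sketch of how a consumer might use this, under several assumptions that are not specified anywhere: `outboxAfter` holds an opaque URL to a collection page, that page lists items oldest-first starting just after the object, and `next` continues in the same direction. `fetch` is a hypothetical callable returning decoded JSON.

```python
def fetch_between(older, newer_id, fetch, max_pages=20):
    """Collect items published after `older`, stopping when newer_id is reached.

    Relies on the hypothetical `outboxAfter` property proposed above; the
    consumer never inspects or constructs the URL, it only follows it.
    """
    collected = []
    page_ref = older.get("outboxAfter")
    for _ in range(max_pages):
        if page_ref is None:
            break
        page = fetch(page_ref)
        for item in page.get("orderedItems", []):
            if item["id"] == newer_id:
                return collected  # gap closed
            collected.append(item)
        page_ref = page.get("next")
    return collected
```

Because the URL is opaque, the server is free to encode an ID, a timestamp, or a cursor in it.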

evanp commented 9 months ago

There is a FEP (Fediverse Enhancement Proposal) specifically for sorting and filtering collections like the outbox. It may make sense to move this conversation to that FEP in order to guide its further development.

https://codeberg.org/fediverse/fep/src/branch/main/fep/5bf0/fep-5bf0.md

tesaguri commented 9 months ago

It seems to me that FEP-5bf0 is about a mechanism for servers to provide a set of views into collections, predetermined by the server itself, and to describe the filter conditions to clients. It is not about letting clients (or remote servers) request a view filtered by a condition that the client specifies dynamically. The latter use case is explicitly mentioned as out of scope in the proposal's Security section.

But yes, we could certainly borrow some concepts from the proposal. Maybe the cursored collection pages could be CollectionViews, for example?