w3c / activitystreams

Activity Streams 2.0
https://www.w3.org/TR/activitystreams-core/

How can I best indicate how an OrderedCollection was ordered? #484

Closed aschrijver closed 4 months ago

aschrijver commented 6 years ago

I interpret 'Ordered' to mean 'Sorted', so it is not just [1, 2, 3, ..], but could be sorted alphabetically, sorted by published or updated property, etc.

When looking at the examples of OrderedCollection in both ActivityStreams-core and ActivityStreams-vocabulary, I find different kinds of ordering being implicitly used. I deduce that when a client retrieves an OrderedCollection, the ordering that was used is also implicit knowledge.

But what would be the best practice to indicate the ordering that was used in a certain OrderedCollection?

nightpool commented 6 years ago

Ordering the same collection of posts in a different way would result in a different OrderedCollection, to my thinking, and thus a different id.

aschrijver commented 6 years ago

Yes, I see from your responses that I should think of an id a bit differently. I was thinking more of database IDs, while in fact any dereferenceable representation in the URI scheme of the implementation's REST API is a unique id.

So any filter or different sort order of a collection will lead to it having a different id, etc.

jhulten commented 5 years ago

Still getting my feet wet in this space. Does a query string count?

cjslep commented 5 years ago

Does a query string count?

As an id? No, the id must satisfy the constraints in the JSON-LD spec

nightpool commented 5 years ago

IRIs aren't allowed to contain query strings?

On Mon, Dec 31, 2018, 10:57 AM Cory J Slep notifications@github.com wrote:

Does a query string count?

As an id? No, the id must satisfy the constraints in the JSON-LD spec https://www.w3.org/TR/json-ld/#node-identifiers

cjslep commented 5 years ago

I interpreted "query string" in the context of "database" discussion to mean "SQL Query in string form", but a URL with query parameters SGTM.
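To make the "different ordering, different id" idea concrete, here is a minimal sketch that derives one id per sort order by putting the ordering into URL query parameters. The base URL and the `sort` and `order` parameter names are hypothetical; AS2 does not define them, and an implementation is free to distinguish the ids in some other way (for example with path segments).

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def collection_id(base, sort_key, direction="desc"):
    """Build a distinct id (IRI) for one ordering of the same underlying posts.

    `sort` and `order` are made-up parameter names used only for illustration.
    """
    parts = urlsplit(base)
    query = urlencode({"sort": sort_key, "order": direction})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# Hypothetical base URL for one user's posts.
BASE = "https://example.org/users/alice/posts"

by_published = collection_id(BASE, "published")
by_updated = collection_id(BASE, "updated")

print(by_published)
print(by_updated)
```

The two ids differ, so a client or cache treats the two orderings as two separate OrderedCollection objects.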

akuckartz commented 5 years ago

Please do not force people to use query parameters if they want to implement the standard.

cjslep commented 5 years ago

No one in this thread is forcing another person to use query parameters in their implementation.

jhulten commented 5 years ago

So as an example, https://github.com/w3c/activitystreams/blob/master/test/core-ex22-jsonld.json

The document does not contain an @id and the format contains no information about the type of ordering applied. If cached, is the source URL required to make sense of this object? Is the default expectation that OrderedCollections are sorted by create time?

nightpool commented 5 years ago

there is no universal ordering to OrderedCollections in ActivityStreams—the "Ordered" means that there is some order, but the spec is agnostic as to what the order may be. for example, in ActivityPub, the inbox relation is an OrderedCollection that is ordered by when the items were received by the server.

it doesn't make any sense to talk about an OrderedCollection "on its own" without some context as to where that collection is used.

jhulten commented 5 years ago

So is this "necessary, but insufficient" as designed? The lack of source context, order key, and order direction means that a consuming service must know the engineer's intent, and it opens the door to multiple implementation decisions that have to be known by the consumer.

cjslep commented 5 years ago

Yes. In my opinion, one could view ActivityPub as a transport specification to physically federate bytes from A to B. It also happens to have a tiny social-media-domain application specification on top of it, but bundled in the original specification. It makes having this view of ActivityPub difficult to separate and easy to criticize. There's a lot of discussion in the community right now on how to handle extending ActivityPub into other domains. Some are going ahead and doing it. It's all very disorganized right now since there's a lot of interested people examining the giant surface area that ActivityPub has unlocked.

nightpool commented 5 years ago

@cjslep this isn't the ActivityPub repo, so I have no idea what you're talking about.

@jhulten If you follow a link to the "posts" OrderedCollection, for example, then you know that it was ordered as that user chooses to order their posts. Depending on what kind of software the user is running, this could be completely arbitrary (picture a drag and drop interface) or reverse chronological (think twitter/facebook), or in order of popularity (think YouTube). There's no reason that a client application needs to know how the posts were ordered in order to display them—for the same reason that there's no reason a web browser needs to know why you happened to put the divs in that order before it displays a page.

jhulten commented 5 years ago

So the difference from a consumer perspective between "Collection" and "OrderedCollection" is that the consumer should not alter the order themselves. The "transport" specification then is responsible for deciding if they want to show the ordering by either including an additional namespace in their context, or specifying the ordering metadata in the identifier in some fashion.

From an ActivityStreams perspective, the only thing we care about is that the consumer does not alter the order. Correct?

nightpool commented 5 years ago

@jhulten right. one way of thinking about it is that OrderedCollections are arrays, and Collections are sets. one has a significant ordering, and the other does not.

cjslep commented 5 years ago

But do note that Collections are not sets: they could contain duplicate values.
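The distinction above can be sketched in a few lines: an OrderedCollection is compared position by position, while a Collection is compared as a multiset, since order is not significant but duplicates may occur. This is only an illustration. The helper names are hypothetical, and it assumes members are plain IRI strings held in `orderedItems` (for OrderedCollection) or `items` (for Collection).

```python
from collections import Counter


def type_names(collection):
    """Normalize the `type` property to a list of type names."""
    value = collection.get("type", [])
    return [value] if isinstance(value, str) else list(value)


def is_ordered(collection):
    return "OrderedCollection" in type_names(collection)


def members(collection):
    """Return members without re-sorting them; the consumer keeps the given order."""
    key = "orderedItems" if is_ordered(collection) else "items"
    return list(collection.get(key, []))


def same_members(a, b):
    """Order-sensitive when both are ordered, multiset comparison otherwise."""
    if is_ordered(a) and is_ordered(b):
        return members(a) == members(b)
    # Unordered collections may still contain duplicates, so compare counts.
    return Counter(members(a)) == Counter(members(b))


ordered_ab = {"type": "OrderedCollection", "orderedItems": ["ex:a", "ex:b"]}
ordered_ba = {"type": "OrderedCollection", "orderedItems": ["ex:b", "ex:a"]}
bag_aba = {"type": "Collection", "items": ["ex:a", "ex:b", "ex:a"]}
bag_aab = {"type": "Collection", "items": ["ex:a", "ex:a", "ex:b"]}
bag_ab = {"type": "Collection", "items": ["ex:a", "ex:b"]}

print(same_members(ordered_ab, ordered_ba))
print(same_members(bag_aba, bag_aab))
print(same_members(bag_aba, bag_ab))
```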

pietercolpaert commented 4 years ago

I just found out activity streams does paged collections differently than Hydra, so sorry I’m late to the party ;-) I have a proposal for describing how a paged collection is ordered over here: https://github.com/HydraCG/Specifications/issues/172

evanp commented 9 months ago

As commenters have mentioned, there is no way built into AS2 to specify the ordering. As seen in the ActivityPub specification, one way to do this is in accompanying documentation. However, that is not sufficient for use with ad hoc collections.

FEP 5bf0 covers sorting and filtering with a CollectionView type. Servers can add a filter or sort property that defines how the collection view was filtered or sorted. It does not give the client a way to arbitrarily filter or sort a view.

CollectionView is specified as a sub-class of OrderedCollection, but it's not clear to me if it's OK to apply the sort and filter to an OrderedCollection. I think the best possibility at this point would be to use multiple types, [CollectionView, OrderedCollection] and use the sort property to show your sorting.
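A rough sketch of what such a multi-typed object might look like, built as a Python dictionary and serialized to JSON. Only the two type names and the existence of a `sort` property come from the description above. The ids are hypothetical, the `sort` value is a placeholder (its real shape is defined by FEP 5bf0 and is not reproduced here), and the extra `@context` entry that would define the FEP's terms is omitted because it is not given in this thread.

```python
import json

# Placeholder only: consult FEP 5bf0 for the actual value shape of `sort`.
SORT_PLACEHOLDER = "<sort description as defined by FEP 5bf0>"

collection_view = {
    # The AS2 context alone does not define CollectionView or sort;
    # a real document would need the FEP's context as well (omitted here).
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.org/users/alice/posts/by-published",  # hypothetical id
    "type": ["CollectionView", "OrderedCollection"],
    "sort": SORT_PLACEHOLDER,
    "totalItems": 2,
    "orderedItems": [
        "https://example.org/posts/2",  # hypothetical member ids
        "https://example.org/posts/1",
    ],
}

print(json.dumps(collection_view, indent=2))
```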

pietercolpaert commented 9 months ago

During this time I’ve been designing the TREE hypermedia specification as part of the W3C TREE Hypermedia Specification community group. It originates from, among others, this discussion on how a client could automatically do something smarter based on the pagination.

We observed that most paginations are just linked or doubly linked lists, with sometimes a way to indicate the ordering, and/or even IRI templates with search forms or controls to change the ordering. This is not the most efficient data structure to retrieve information, as it doesn’t allow smart selection of “branches” or “fragments of the collection” of data a client could be interested in. As an alternative, we propose to not paginate in a linked list, but to paginate according to a search tree. This can be done by not saying something is a “next” page, but by having a slightly more elaborate explanation of a relation from one page to another, and thus by allowing multiple more elaborate tree:Relation objects per page.

This is an example of how to describe such a relation:

> HTTP GET https://example.org/Node1

ex:Collection1 a tree:Collection;
            tree:view ex:Node1 ;
            tree:member ex:Activity1, ex:Activity2 .

ex:Node1 a tree:Node ;
        tree:relation ex:R1,ex:R2 .

ex:R1 a tree:LessThanRelation ; # A client looking only for values at or after tree:value can skip this relation
    tree:node ex:Node3 ; # This is the URL of another page
    tree:value "2023-11-11T23:00:00Z" ;
    tree:remainingItems 7 ;
    tree:path as:published .

ex:R2 a tree:GreaterThanOrEqualToRelation ;
    tree:node ex:Node4 ; # This is the URL of another page
    tree:value "2023-11-16T23:00:00Z" ;
    tree:remainingItems 10 ;
    tree:path as:published .

ex:Activity1 a as:Create ;
            ...
       .

ex:Activity2 a as:Update ;
          ...
          .

In this way, we can achieve an ordered dataset, in the way we want on the client, by just traversing the collection of items in a certain direction.
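As a rough illustration of how a client could use these relations, the sketch below decides which linked pages are worth fetching when it only wants members published at or after a given instant. The relation data mirrors ex:R1 and ex:R2 from the example; the dictionary layout and the function names are hypothetical, and a real client would read the relations from the parsed RDF instead.

```python
from datetime import datetime


def parse(timestamp):
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


# Mirrors ex:R1 and ex:R2 above; both relations constrain as:published.
RELATIONS = [
    {"type": "tree:LessThanRelation", "node": "ex:Node3",
     "value": "2023-11-11T23:00:00Z"},
    {"type": "tree:GreaterThanOrEqualToRelation", "node": "ex:Node4",
     "value": "2023-11-16T23:00:00Z"},
]


def nodes_to_follow(relations, wanted_from):
    """Return nodes that may hold members published at or after `wanted_from`."""
    follow = []
    for relation in relations:
        bound = parse(relation["value"])
        if relation["type"] == "tree:LessThanRelation" and bound <= wanted_from:
            # Everything behind this relation is earlier than `bound`,
            # hence earlier than `wanted_from`: prune it.
            continue
        # Any other relation type is followed, to stay on the safe side.
        follow.append(relation["node"])
    return follow


recent_only = nodes_to_follow(RELATIONS, parse("2023-11-12T00:00:00Z"))
everything = nodes_to_follow(RELATIONS, parse("2023-11-01T00:00:00Z"))
print(recent_only)
print(everything)
```

With a lower bound of 12 November, the page behind the less-than relation is skipped; with a lower bound of 1 November, both pages have to be fetched.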

kkostov commented 9 months ago

I think that's an excellent suggestion @pietercolpaert. By conforming to tree, Relations and even Search forms could offer a solution for the challenge at hand and more.

When paginating LDES streams, we face similar challenges as some implementations of ActivityStreams (e.g. traversing user inboxes on the fediverse, which can be huge and may require the collection to be ordered or filtered differently, depending on use cases), which tree provides for.

evanp commented 5 months ago

@aschrijver we're waiting on your feedback before closing this issue.

aschrijver commented 5 months ago

Thanks for reminding me, Evan. It has been 5.5 years since I filed this issue, and this is not a current concern for me anymore, nor can I gauge the needs of current implementations. If this is closed it is fine by me, and then FEP-5bf0: Collection sorting and filtering is the recommended approach, and also where @pietercolpaert and @kkostov best leave their feedback.