w3c / activitystreams

Activity Streams 2.0
https://www.w3.org/TR/activitystreams-core/

totalItems property could be approximate #429

Closed akihikodaki closed 1 year ago

akihikodaki commented 6 years ago

Please Describe the Issue:

Mastodon, an ActivityPub implementation, uses the totalItems property of the outbox to report the number of notes. The view of the outbox varies by viewer: some notes have a limited audience, and the view is restricted accordingly. However, totalItems is the count of all notes, including those excluded from the view. This is a technical limitation of the server; counting only the notes a given viewer can see is too expensive. The specification should note that totalItems could be approximate, and an extension to indicate the precision of totalItems should be introduced if necessary.

gobengo commented 6 years ago

You might also just want to have two separate logical resources for 'all the notes' (/notes/) and 'all the notes that you can see' (/notes/?visibleBy=bengo). Don't specify totalItems on the latter if you can't compute it.

But even still, it would be nice for some sort of well-known vocabulary item to point to that separate Collection.
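For illustration, here is a minimal Python sketch of that split. The `build_collection` helper, the `example.org` URLs, and the count of 3 are all hypothetical, and the `visibleBy` query parameter is only the informal suggestion above, not something defined by ActivityPub or used by Mastodon. The point is only that totalItems is left out when the server cannot compute it for that view.

```python
from typing import Optional


def build_collection(collection_id: str, total: Optional[int]) -> dict:
    """Build an OrderedCollection header; omit totalItems when it is unknown."""
    collection = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": collection_id,
        "type": "OrderedCollection",
    }
    if total is not None:
        # Only advertise a count the server has actually computed for this view.
        collection["totalItems"] = total
    return collection


# Logical resource 1: all the notes, where the count is cheap to obtain.
all_notes = build_collection("https://example.org/notes/", 3)

# Logical resource 2: the notes one viewer can see; the count is too
# expensive to compute (assumption), so it is simply not specified.
visible_notes = build_collection("https://example.org/notes/?visibleBy=bengo", None)
```

Whether the unfiltered resource should itself be readable by everyone is a separate question.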

akihikodaki commented 6 years ago

Then ActivityPub would need an extension. Should I open an issue for ActivityPub as well?

akihikodaki commented 6 years ago

By the way, note that an implementation with an incorrect totalItems has already been released and is being deployed to a Web service with 200k users. This will become more difficult to fix as time passes.

gobengo commented 6 years ago

"Should I open an issue for ActivityPub as well?"

IMO no. Create a new repo or gist or blog post describing your extension proposal.

evanp commented 6 years ago

totalItems is supposed to be the total number of items for the logical view you're using. In this case, the total number of items that the user can see.

https://www.w3.org/TR/activitystreams-vocabulary/#dfn-totalitems

"A non-negative integer specifying the total number of objects contained by the logical view of the collection. This number might not reflect the actual number of items serialized within the Collection object instance."

If it's too hard to calculate, maybe just leave it out. What's the value of a totalItems property that doesn't have the correct total number of items?

evanp commented 6 years ago

I'd also point out that this is a security issue. The existence of activities that cannot be read by the current principal is potentially sensitive information.
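To make the leak concrete, here is a small Python sketch of what any viewer could infer. The outbox contents, the numbers, and the `hidden_item_count` helper are made up for illustration; the only assumption is that the server reports a totalItems covering notes the viewer cannot read.

```python
def hidden_item_count(collection: dict, readable_items: list) -> int:
    """Return how many unreadable items a viewer can infer from totalItems."""
    advertised = collection.get("totalItems", len(readable_items))
    return advertised - len(readable_items)


# Hypothetical outbox: the server counts every note, regardless of audience.
outbox = {"type": "OrderedCollection", "totalItems": 5}

# The notes this viewer actually received after paging through the outbox.
readable = ["note-1", "note-2", "note-3"]

hidden = hidden_item_count(outbox, readable)
```

Here `hidden` is 2: the viewer learns that two notes exist that they are not allowed to read.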

evanp commented 6 years ago

One last question: I'm having a hard time coming up with a case where the count of items in the logical view is larger than the "real" count. My only idea is if a logical view shows more than one item for each underlying item in the collection -- say, splitting up nameMap elements and making separate objects for each. Seems tricky though!

evanp commented 6 years ago

So, I think the solution here is that Mastodon shouldn't include totalItems if it can't actually calculate the number of items visible to the principal.

evanp commented 1 year ago

I added a page in the AS2 Primer on the W3C wiki to cover this topic.

https://www.w3.org/wiki/Activity_Streams/Primer/totalItems_in_Collection

In practice, consumers should assume that totalItems is an upper bound on the number of items in the collection.

Producers of collections should not include this number if they can't calculate it correctly, since it leaks sensitive information about the collection.
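On the consumer side, a minimal Python sketch of treating totalItems as an upper bound might look like the following. The function names are hypothetical and not taken from the Primer or any library; the sketch assumes the collection has already been parsed into a dict.

```python
from typing import Optional


def total_items_upper_bound(collection: dict) -> Optional[int]:
    """Return totalItems as an upper bound, or None if absent or invalid."""
    value = collection.get("totalItems")
    # totalItems must be a non-negative integer; bool is excluded explicitly
    # because it is a subclass of int in Python.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        return None
    return value


def remaining_at_most(collection: dict, fetched: int) -> Optional[int]:
    """Return the most items that could still be unfetched, or None if unknown."""
    bound = total_items_upper_bound(collection)
    if bound is None:
        return None
    # Never report a negative remainder if more items arrived than advertised.
    return max(bound - fetched, 0)
```

A client would use the result for display hints such as "at most N more" rather than as an exact count, and would fall back to paging until the collection ends when the result is `None`.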