Tracking read state or position

aaronpk commented 6 years ago

Some people would like to be able to track the read/unread state of individual items in a feed. Traditional feed readers typically work this way, although more modern interfaces like Slack keep only a single read pointer pointing to where in the stream was last read.

aaronpk commented 6 years ago

In case you want opinions, individual state is crucial in a reader for me: for feeds with long-form content I often read individual items, not in order. The API could provide bulk methods for mark-as-read to support clients using the different style (maybe based on the timeline/paging mechanism?) www.svenknebel.de 04:13, 4 December 2017 (PST)

aaronpk commented 6 years ago

I too rely heavily on per-item read state. I think this comes down to a per-feed preference. This is because, the way I see it, there are 2 types of feeds I can subscribe to:

A stream I am interested in. Like a person’s Twitter (https://indieweb.org/Twitter) or other micro-blogging timeline. I am not particularly interested in every separate note on this feed, it is the feed (or rather the person!) I am interested in. If I am away from my reader for a weekend, I am not likely to go back and read what I missed. I just want to see their thoughts when they share them. A single “read pointer” is enough for these.
A collection of long-form posts, most likely articles, that I am interested in. Here my interest goes out to the separate posts. The feed in its entirety is of no real concern, that’s just a way for a publisher to communicate their collection to me. Because I see each post as a singular entity, that is how I would expect their read state to work. Marking everything as read prior to a single point makes little sense, especially when the only commonality between the posts is that they happened to come from the same publisher.

In the chat it was raised that this is “treating it like email”. I would say instead that I treat it like a magazine subscription. I do not always want to read from cover to cover. But just because I happened to read the one in the back first, doesn’t mean I skipped everything before it because I didn’t want to read it. I merely skimmed the titles and will then read in order of relevance, where relevance is subjective. This might mean marking some things as read right after skimming the title (irrelevant) and keeping other things as unread even after the next batch of posts (magazine issue) has already come and gone.

Some people have used things like “star” in feed readers to get around this problem. But I think that is a bad usage of the feature. Read state should be used to remind me that I haven’t read a thing yet, not bookmarking.

— martijnvdven

aaronpk commented 6 years ago

I'm implementing a draft of this in Aperture right now. Here is the current API.

Every entry now includes a unique system ID, meant for internal identification of the item (not global identification). This is returned in the timeline response as the parameter _id, and there is now also _is_read. For example:

{
  "items": [
    {
      "type": "entry",
      "url": "http://example.com/100",
      ...
      "_id": "41003",
      "_is_read": false
    }
  ]
}

These new _id values are meant to be opaque to clients, and must always be a string. Some servers will likely use integer database IDs, but other servers may use other string identifiers for entries depending on the implementation.
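A client might consume these two properties roughly as follows. This is only a sketch in Python: the function name unread_ids and the sample data are made up for illustration, and treating an item without _is_read as "not tracked" (rather than unread) is an assumption, not something the draft defines.

```python
import json


def unread_ids(timeline_response: str) -> list[str]:
    """Collect the _id of every item the server explicitly reports as unread."""
    data = json.loads(timeline_response)
    ids = []
    for item in data.get("items", []):
        # Assumption: items without _is_read are not tracked, so skip them.
        if item.get("_is_read") is False and "_id" in item:
            # _id is opaque: keep it as a string, never parse it as a number.
            ids.append(str(item["_id"]))
    return ids


# Hypothetical sample response with one unread and one read entry.
sample = json.dumps({
    "items": [
        {"type": "entry", "url": "http://example.com/100",
         "_id": "41003", "_is_read": False},
        {"type": "entry", "url": "http://example.com/99",
         "_id": "41002", "_is_read": True},
    ]
})

print(unread_ids(sample))
```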

Retrieving the list of channels now also includes the number of unread entries in the channel:

{
  "channels": [
    {
      "uid": "notifications",
      "name": "Notifications",
      "unread": 0
    },
    {
      "uid": "YPGiUrZjNM36LNdpFy7eSzJE7o2aK82z",
      "name": "IndieWeb",
      "unread": 7
    }
  ]
}

To mark an individual entry as read:

action=timeline
channel=example
method=mark_read
entry=1234

To mark multiple entries as read:

action=timeline
channel=example
method=mark_read
entry[]=1234
entry[]=5678

Both of the above also work with method=mark_unread.

To mark an entry read as well as everything before it:

action=timeline
channel=example
method=mark_read
last_read_entry=1234

This is to address the use case of streams, where you really only care about knowing where in the stream you've scrolled to and whether there are any new entries since then.

This is mostly inspired by the Feedly Markers API, specifically "Mark one or more articles as read" and "Mark a feed as read".
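For client authors, the three request shapes above could be produced by one helper. The sketch below is in Python and only builds the form-encoded body; the function name mark_read_body is hypothetical, and sending the request (endpoint URL, access token) is left out because those details are not part of this comment. Note that urlencode percent-encodes the brackets in entry[], which form parsers decode back to entry[].

```python
from urllib.parse import urlencode


def mark_read_body(channel, entries=None, last_read_entry=None,
                   method="mark_read"):
    """Build the form-encoded body for a mark_read / mark_unread request."""
    if (entries is None) == (last_read_entry is None):
        raise ValueError("pass exactly one of entries or last_read_entry")
    if last_read_entry is not None and method != "mark_read":
        # Only the entry / entry[] forms are described as working
        # with mark_unread.
        raise ValueError("last_read_entry is only used with mark_read")

    params = [("action", "timeline"), ("channel", channel), ("method", method)]
    if last_read_entry is not None:
        params.append(("last_read_entry", last_read_entry))
    elif len(entries) == 0:
        raise ValueError("entries must not be empty")
    elif len(entries) == 1:
        params.append(("entry", entries[0]))
    else:
        params.extend(("entry[]", entry_id) for entry_id in entries)
    return urlencode(params)


print(mark_read_body("example", entries=["1234"]))
print(mark_read_body("example", entries=["1234", "5678"]))
print(mark_read_body("example", last_read_entry="1234"))
```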

EdwardHinkle commented 6 years ago

Not that it has to be implemented right now, but I do want to make a case for the “updated” field of a channel. To reduce noise, for most of my channels I’ll want the channel’s “unread indicator” to disappear when I reach the top of the timeline (even if things are unread). When a channel is updated (receives new posts), I would want to be able to re-enable the unread indicator, essentially saying “there are new posts here” rather than “there are unread posts here”. In fact, now that I say it, I might make the indicator a different color as well. The purpose of such a channel is that I want to know what I have and haven’t read, while only being prompted to open the channel if there are new posts. The “new posts” indicator essentially gives that channel a higher priority for my time than one without new posts. Then, when I have more time, I can go back to an existing channel and still know what I haven’t read (which is why this can’t just use “last_read_entry”, even though that is a useful method).

(Originally published at: https://eddiehinkle.com/2018/02/12/7/reply/)

aaronpk commented 6 years ago

Now that you mention it, I think Facebook's notifications work this way.

They show a little number over the notification icon with the number of new notifications. However, they also track whether you've clicked on individual notifications separately. If you click the icon with the number, then the number is cleared and won't show again until there is new content. If you click on an individual notification, then the next time you drop down the notification bar that notification will be white instead of blue.

aaronpk commented 6 years ago

Here's a question. Do you imagine this additional state being something that only individual clients are aware of, or should that be synced to the server as well?

If the server returns the "updated" date, then the client has enough information to show the indicator itself. But as far as other clients are concerned, they wouldn't know about whether you've seen those posts in another client.

I'm kind of leaning towards it being a client-only thing, at least for now.

If that's going to end up getting pushed to the server then I think we need to better define the different kinds of states. Maybe "read" vs "seen", where "seen" is the soft indicator that the client has displayed the post to the user, and "read" means they've opened it up (or maybe even explicitly marked it as read).
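To make the client-only option concrete, here is a rough Python sketch of a client keeping its own "seen" marker per channel and comparing it to the channel's "updated" date. Everything here is an assumption for illustration: the class name NewPostsTracker, the idea that "updated" arrives as an ISO 8601 string with a UTC offset, and the choice to show the indicator for channels never opened before.

```python
from datetime import datetime, timezone


class NewPostsTracker:
    """Client-side 'seen' state; nothing here is synced to the server."""

    def __init__(self):
        # channel uid -> time the user last viewed the top of that channel
        self._last_seen = {}

    def mark_seen(self, channel_uid, when):
        self._last_seen[channel_uid] = when

    def has_new_posts(self, channel):
        updated = channel.get("updated")  # hypothetical field
        if updated is None:
            return False
        updated_at = datetime.fromisoformat(updated)
        last_seen = self._last_seen.get(channel["uid"])
        # Assumption: a channel that was never opened counts as "new".
        return last_seen is None or updated_at > last_seen


tracker = NewPostsTracker()
channel = {"uid": "indieweb", "updated": "2018-02-12T10:00:00+00:00"}
print(tracker.has_new_posts(channel))
tracker.mark_seen("indieweb", datetime(2018, 2, 12, 11, 0, tzinfo=timezone.utc))
print(tracker.has_new_posts(channel))
```

Because the marker lives only in this client, a second client would show the indicator again, which is the trade-off described above.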

aaronpk commented 6 years ago

I should add that the _id values are meant to be an identifier for this instance of the entry in the channel, not an identifier for the entry across channels. This means if you're following the same feed in multiple channels, entries may have a different _id.
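One consequence for clients that cache read state locally: the cache key needs to include the channel, not just the _id. A minimal Python sketch, with a hypothetical ChannelReadCache class and made-up channel names and IDs:

```python
class ChannelReadCache:
    """Local cache of read state, keyed by (channel uid, entry _id)."""

    def __init__(self):
        self._state = {}

    def remember(self, channel_uid, item):
        # The same post can appear in two channels under different _id
        # values, so the channel uid is part of the key.
        self._state[(channel_uid, item["_id"])] = bool(item["_is_read"])

    def is_read(self, channel_uid, entry_id):
        # Returns None when this (channel, _id) pair has not been seen.
        return self._state.get((channel_uid, entry_id))


cache = ChannelReadCache()
# Same URL followed in two channels, with a different _id in each.
cache.remember("indieweb", {"url": "http://example.com/100",
                            "_id": "41003", "_is_read": True})
cache.remember("friends", {"url": "http://example.com/100",
                           "_id": "52001", "_is_read": False})
print(cache.is_read("indieweb", "41003"), cache.is_read("friends", "52001"))
```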

EdwardHinkle commented 6 years ago

In my opinion I think that is a client-only thing. Different clients might have different read/unread formats, based on their creators. Of course we could change our opinion in the future but for now, I think the updated date just helps clients to do their own unique stuff.

(Originally published at: https://eddiehinkle.com/2018/02/12/10/reply/)

grantcodes commented 6 years ago

I'm working on implementing this in Together now. I'm sure there is a valid reason, but I'm wondering why the _id is different from the paging before and after values. Could they not be the same?

aaronpk commented 6 years ago

The before and after values are meant to represent pages of data, not necessarily individual records. In my case, the after value refers to an item that isn't in the current page. I could return a string for _id that looks more like the before and after strings, but that's just an implementation detail of my server. Alternately I may switch my before/after strings to look more like the current _id value. Either way, this difference doesn't seem important to the client.

EdwardHinkle commented 6 years ago

What are the return values for success and/or failure of this operation? Or is it just HTTP 200 or 202 for success?

aaronpk commented 6 years ago

Unless you can think of something to do with the response body, I would leave that undefined in the spec.

A successful response should be HTTP 200 or 202; anything else is a failure. Most often you would see an HTTP 400 for a failure, for example when a channel ID is not found.

EdwardHinkle commented 6 years ago

Perfect. That's what I assumed, but I just wanted to make sure.

EdwardHinkle commented 6 years ago

Would

action=timeline
channel=example
method=mark_read
entry[]=1234

be valid? In the Swift struct I'm using to generate the POST request, I'm sure I could get it to just send entry= if there is only one entry, but if it's not required, it's probably just easier to leave it as entry[]=

aaronpk commented 6 years ago

That will work with Aperture right now, but let's make this explicitly allowed in the spec as well.

manton commented 6 years ago

I haven't been following this, but I think it's worth checking on Tweet Marker (https://github.com/manton/tweetmarker) and App.net's Stream Marker (https://gist.github.com/mthurman/4062406) as well, since those APIs have been widely used for this sort of thing. No need to copy them, but just to make sure nothing important is missing.

aaronpk commented 6 years ago

I pushed an update to Aperture which allows you to toggle per-channel whether read state tracking is enabled. There are two modes, one where it returns the count of the number of unread items, and the other where it returns only true or false depending on whether there are new items.

For my super busy feeds, it wasn't useful having the counts, but I do like a subtle indicator there are new posts.

There are also some channels I don't want to be bothered about at all, so I've disabled read state tracking on those.

This means the Microsub API is now returning either an integer or a boolean for the unread property on channels, e.g.:

{
  "channels": [
    {
      "uid": "notifications",
      "name": "Notifications",
      "unread": 0
    },
    {
      "uid": "31eccfe322d6c48c50dea2c84efc74ff",
      "name": "IndieWeb",
      "unread": true
    }
  ]
}
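A client then has to handle both types for unread. The Python sketch below shows one way to do that; the function name unread_badge and the badge strings are made up, and treating a missing unread property as "tracking disabled" is an assumption about how a server would signal that mode.

```python
def unread_badge(channel):
    """Return the text to show next to a channel name, or '' for no badge."""
    unread = channel.get("unread")
    # Check bool before int: in Python, bool is a subclass of int,
    # so True would otherwise be treated as the count 1.
    if isinstance(unread, bool):
        return "new" if unread else ""
    if isinstance(unread, int):
        return str(unread) if unread > 0 else ""
    # Assumption: no unread property means read tracking is disabled.
    return ""


channels = [
    {"uid": "notifications", "name": "Notifications", "unread": 0},
    {"uid": "31eccfe322d6c48c50dea2c84efc74ff", "name": "IndieWeb",
     "unread": True},
]
for channel in channels:
    print(channel["name"], repr(unread_badge(channel)))
```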

jameysharp commented 6 years ago

I'm writing an essay on what technology is missing for decentralized publishing of serial content (webcomics, fanfiction, podcasts, etc) and read-state synchronization is one of my topics. I'm hoping Microsub can be one good answer in that context.

For the kinds of serials that have ongoing plot/continuity, it's worth noting that

entry order matters, and
creators sometimes add or remove entries arbitrarily far back in the archive.

As a result, individual entry read-marks matter here because you'd like to know that there's something new even though it comes before entries you've already read.

Also: old entries can be edited. You may want to know that you read an older version of the entry so you can decide whether you care to read the updates, so I'd like to keep track of what the last-modified timestamp was for the version of the entry that you read. Note that this is not the same as the timestamp when you read the entry, because you might be racing with the publisher's updates.

I could picture it also being useful to record the timestamp when you read each entry for some kind of data analysis later—for example, predicting which feeds you like best, or clustering feeds that you like to read in close proximity to each other. But I don't think that's important for a first version of this spec.
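As a rough illustration of the version-aware read-mark idea, here is a Python sketch. None of this is part of Microsub: the ReadMark type, its field names, and the changed_since_read helper are all hypothetical, and it assumes the publisher exposes a last-modified timestamp for each entry.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ReadMark:
    entry_id: str
    # Last-modified timestamp of the version that was read,
    # not the time at which the reader read it.
    version_read: datetime


def changed_since_read(mark, current_last_modified):
    """True if the entry was edited after the version the reader saw."""
    return current_last_modified > mark.version_read


mark = ReadMark("chapter-12",
                datetime(2018, 3, 1, 9, 0, tzinfo=timezone.utc))
edited = datetime(2018, 3, 5, 18, 30, tzinfo=timezone.utc)
print(changed_since_read(mark, edited))
```

Storing the version's timestamp rather than the reading time avoids the race mentioned above: if the publisher edits the entry while it is being read, the mark still points at the older version.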

aaronpk commented 6 years ago

Interesting use case!

I feel like it's worth pointing out that the spec only needs to describe as much as is required for interop between clients and servers, and servers are free to do as many additional fancy things as they want.

> I could picture it also being useful to record the timestamp when you read each entry for some kind of data analysis later

Sounds like it might not hurt to also have the client send a timestamp of when the item was marked read, in case the client is syncing a bunch of these read-marks after being offline. Of course servers are free to ignore that if they don't support tracking the timestamp of reads.

> creators sometimes add or remove entries arbitrarily far back in the archive

Issue #24 could help in this case, being able to retrieve only the unread items in a timeline.

aaronpk commented 6 years ago

This has been implemented in Aperture, Monocle, Together, and Indigenous, and documented at https://indieweb.org/Microsub-spec#Mark_Entries_Read so I'm going to close this issue. Let's open new issues for any future discussion about the specific behaviors within read-state tracking.

grantcodes commented 6 years ago

Not exactly an issue but why should the server have the option to only return true / false for the unread count on a channel? Could it not be selected on the client and just use if > 0 logic?

I suppose maybe some servers will only be able to return true / false which makes it easier to build a server...

aaronpk commented 6 years ago

In Aperture, I wanted the option to choose whether to return boolean or counts (or disable it completely) per channel. This way the client can just react to the data it has.

indieweb / microsub

Tracking read state or position #4