Open mstoecklein opened 1 year ago
These proposals have been extensively discussed here in the past.
I think @hoytech is working on something
I have thought a little bit before about cursors and offsets, but I'm not currently working on anything. For something like this I'd want it to be specified as a NIP before implementing it, and I'm not sure there's an appetite for that at this time. Paging by timestamp seems good enough for most clients.
Clients should either store everything locally so they can paginate or they should accept the chaos and that they will not always get everything in the most perfect pristine absolute order.
They should do that, and many clients are actually doing so right now. However, when communicating with multiple relays and requesting the same set of data, it would be excellent to have the ability to specify the subset of data that you already possess.
Timestamp pagination is good but not enough. As @hoytech says, that's definitely a discussion for another NIP.
Having good pagination sounds like it would fix all my problems with the home feed rn.
Been wondering how best to do pagination for a while. Would honestly love a formal specification for this.
The way I'm currently doing it is with since, until, and limit filters.
1. Start with a filter that sets "limit": 20.
2. Get the events, then sort them descending by timestamp.
3. The until value for your next filter is the created_at timestamp of the oldest event in your collection.
I'm doing a bunch of crazy Mastodon stuff in this repo which might not make sense, but nevertheless my pagination code is here: https://gitlab.com/soapbox-pub/ditto/-/blob/7c2de9b2cf72a61efdac1c8f19418b377308612d/src/utils/web.ts#L66-106
And then here's how I use it to create a paginated home feed: https://gitlab.com/soapbox-pub/ditto/-/blob/7c2de9b2cf72a61efdac1c8f19418b377308612d/src/controllers/api/timelines.ts#L12-17
It's still not perfect. I sometimes end up with gaps between pages. I'm planning to fix that by querying a few more events than I really need and chopping them off between steps 2 and 3.
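A rough sketch of the three steps above, with the over-fetch idea folded in. The PageEvent and PageFilter shapes are trimmed-down assumptions, and nextPage is a hypothetical helper, not the code from the linked repo.

```typescript
// Minimal shapes, assumed for illustration only.
interface PageEvent {
  id: string;
  created_at: number; // unix seconds
}

interface PageFilter {
  until?: number;
  limit?: number;
}

// Hypothetical helper: turn one relay response into the page to display
// plus the filter for the next request.
function nextPage(
  received: PageEvent[],
  seen: Set<string>, // IDs already shown on earlier pages
  pageSize: number,
  overfetch: number, // extra events to request beyond pageSize
): { page: PageEvent[]; next: PageFilter | null } {
  // Step 2: sort newest first; break timestamp ties by ID for a stable order.
  const sorted = [...received].sort(
    (a, b) => b.created_at - a.created_at || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0),
  );
  // Drop events already shown, then chop off the extras.
  const page = sorted.filter((e) => !seen.has(e.id)).slice(0, pageSize);
  for (const e of page) seen.add(e.id);
  if (page.length === 0) return { page, next: null };
  // Step 3: `until` for the next filter is the created_at of the oldest event kept.
  const oldest = page[page.length - 1];
  return { page, next: { until: oldest.created_at, limit: pageSize + overfetch } };
}
```

Keeping until equal to the oldest timestamp (rather than subtracting 1) relies on the seen set to drop repeats. It would still stall if more than limit events share the same second.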
Using an offset filter doesn't make sense because new events could come in between the first call and the second. With offset being independent of your application state, you'll end up getting the same events over and over.
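To make the drift concrete, here's a tiny in-memory simulation. It is not a real relay: queryWithOffset is a hypothetical stand-in for an offset-style filter over a newest-first list.

```typescript
// Hypothetical stand-in for a relay answering an offset/limit query
// over events ordered newest first.
function queryWithOffset(events: string[], offset: number, limit: number): string[] {
  return events.slice(offset, offset + limit);
}

let feed = ["e5", "e4", "e3", "e2", "e1"]; // newest first
const firstCall = queryWithOffset(feed, 0, 2); // ["e5", "e4"]

// Two new events arrive before the second call.
feed = ["e7", "e6", ...feed];
const secondCall = queryWithOffset(feed, 2, 2); // ["e5", "e4"] again

// Events that the client received twice.
const repeated = secondCall.filter((id) => firstCall.includes(id));
console.log(repeated); // ["e5", "e4"]
```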
Paginating by timestamp makes the most sense, but since the timestamp is low precision (second rather than ms), you might need to subtract 1 from it while paginating.
Something I find myself wanting a lot is an "event ID + timestamp" identifier for pagination. The dumb way would be like ${lpad(event.created_at, '0', 11)}${event.id}, e.g.:
01693595340a2c2fe7c59c0d83d8a2ecd13f911550621177f389170a65c0cbe9ce4cb62de19
The first 11 characters are the timestamp, so just chop them off to get the event ID.
Benefits:
["e-cron", "a2c2fe7c59c0d83d8a2ecd13f911550621177f389170a65c0cbe9ce4cb62de19", "01693595340"]
I kind of want this to be a type of nip19 ID, but I don't think it's possible to preserve the "sorting alphabetically to sort chronologically" feature with the way nip19 encoding works.
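A minimal sketch of that combined identifier, assuming a plain concatenation with no extra encoding. encodeCursor and decodeCursor are hypothetical names. The pad width is a parameter, and the default of 11 is an assumption chosen to match the 11-character timestamp prefix in the example string above.

```typescript
// Hypothetical helpers for an "event ID + timestamp" pagination cursor.
const TS_WIDTH = 11; // assumption: matches the prefix length of the example string

function encodeCursor(createdAt: number, id: string, width: number = TS_WIDTH): string {
  // Zero-pad the timestamp to a fixed width, then append the event ID.
  return String(createdAt).padStart(width, "0") + id;
}

function decodeCursor(cursor: string, width: number = TS_WIDTH): { createdAt: number; id: string } {
  // Chop off the fixed-width timestamp prefix to recover the event ID.
  return { createdAt: Number(cursor.slice(0, width)), id: cursor.slice(width) };
}

const exampleId = "a2c2fe7c59c0d83d8a2ecd13f911550621177f389170a65c0cbe9ce4cb62de19";
const cursor = encodeCursor(1693595340, exampleId);
const older = encodeCursor(1693595000, "ff".repeat(32));

// Fixed-width prefixes mean a plain string sort is also a chronological sort.
const ordered = [cursor, older].sort();
```

Here ordered puts older first even though its ID sorts after exampleId, because the timestamp prefix is compared before the ID.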
Here's an easy way to do pagination: query each relay with until being the smallest timestamp you saw from that relay for that query plus one.
If you just started, do an initial query with no time constraints and a limit.
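Sketched out, that bookkeeping could look something like this. RelayCursor and RelayFilter are hypothetical names, and the relay URLs in the usage lines are placeholders.

```typescript
// Assumed filter shape, trimmed to the fields used here.
interface RelayFilter {
  until?: number;
  limit?: number;
}

// Hypothetical bookkeeping for one query: relay URL -> smallest created_at seen so far.
class RelayCursor {
  private oldest = new Map<string, number>();

  // Record the timestamps of a batch of events returned by one relay.
  observe(relay: string, timestamps: number[]): void {
    for (const ts of timestamps) {
      const prev = this.oldest.get(relay);
      if (prev === undefined || ts < prev) this.oldest.set(relay, ts);
    }
  }

  // Build the next filter for that relay.
  nextFilter(relay: string, limit: number): RelayFilter {
    const prev = this.oldest.get(relay);
    // Nothing seen yet: initial query with no time constraint, only a limit.
    if (prev === undefined) return { limit };
    // Otherwise: until = smallest timestamp seen from this relay, plus one.
    return { until: prev + 1, limit };
  }
}

const relayCursor = new RelayCursor();
const initial = relayCursor.nextFilter("wss://relay-a.example", 20); // { limit: 20 }
relayCursor.observe("wss://relay-a.example", [100, 90, 95]);
const followUp = relayCursor.nextFilter("wss://relay-a.example", 20); // { until: 91, limit: 20 }
```

Since each follow-up query overlaps the previous page by design, the client still has to drop duplicates by event ID.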
This doesn't work because some relay implementations (strfry?) don't return events in order.
The first version of strfry didn't, but it was a bug. Now it does. Some relays are still running old versions though.
It should be possible to create a REQ filter that can be used to paginate notes independently of the timestamp. I suggest an offset filter in conjunction with the limit filter:
Example