Closed tigerbears closed 12 years ago
I also expect the since_id to work just as @tigerbears described above. Please return the next N items immediately following since_id (not the most recent N).
I was about to post the same.
Twitter also shares this issue. I blogged about it 2 years ago http://shkspr.mobi/blog/index.php/2010/06/twitter-api-pagination-and-ids/
It might be an idea for the API to return an HTTP 206 (Partial Content) in the case where you say since_id=1000&count=20 and the current head is 1100 (in which case you get items 1100 - 1080)
An HTTP 206 could tell the client that there is more data than it asked for, and it could subsequently either kick off additional requests or display something in the UI.
Another option is to have a different parameter entirely that has the behavior I described above, whether passed along with or used instead of since_id.
Avoiding more roundtrips would be good - imagine in Tony's scenario if the current head is 100000. A 206 response would be awesome if there are more viable posts available in that "direction," though. Perhaps with a header specifying that next post's id to make the next request more efficient? The same could apply to user lists when it's their turn to be paginated.
Twitter did solve this, as far as I remember, by returning a flag telling you the data isn't complete (or at least that there's more than you asked for). Twitter returns a dictionary, as opposed to the array that app.net returns. So it looks like a header or HTTP response code is the only way around this.
Note: I've actually built a paging system that works this way for a "live wall for events/places social network thing". Since it was all app driven, it was trivial to implement (using 206 responses etc). While server round trips are unfortunate, being clever about it can reduce the impact, and once streaming comes online it should just allow us to append data to the top of our tables without having to worry!
This hasn't been widely publicized yet, but one of our first migrations should solve this exact problem (https://github.com/appdotnet/api-spec/blob/master/migrations.md#current-migrations). We want to change the format of responses so they are always hashes instead of sometimes being lists. They will also have meta-information such as a "more" flag. Thus, I don't think the pagination behavior will be changed (we'll always return from newest to oldest respecting the since/before bounds you give us) but this should solve the problem.
Once we get the documentation all updated, we'll probably make this response envelope the default for new apps next week.
Example:
curl -ik -H 'Authorization: BEARER ...' -H "X-ADN-Migration-Overrides: response_envelope=1" https://alpha-api.app.net/stream/0/users/me/mentions?count=4
{
    "data": [
        ...the posts would be here...
    ],
    "meta": {
        "code": 200,
        "max_id": "188825",
        "min_id": "187020",
        "more": true
    }
}
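As a rough illustration of how a client might use this envelope, here is a minimal Python sketch that walks from newest to oldest until "more" is false. The `fetch_page` function and `POST_IDS` list are hypothetical in-memory stand-ins, not the real app.net endpoint; only the `data`/`meta` shape and the `more`, `max_id`, and `min_id` fields come from the example above.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory data standing in for a stream; ids 1..10, oldest to newest.
POST_IDS = list(range(1, 11))


def fetch_page(count: int,
               before_id: Optional[int] = None,
               since_id: Optional[int] = None) -> Dict:
    """Hypothetical stand-in for an endpoint call.

    Returns the newest `count` posts inside the bounds, newest first,
    wrapped in an envelope shaped like the example response.
    """
    matching = [i for i in POST_IDS
                if (before_id is None or i < before_id)
                and (since_id is None or i > since_id)]
    matching.sort(reverse=True)
    page = matching[:count]
    meta = {"code": 200, "more": len(matching) > count}
    if page:
        meta["max_id"] = str(page[0])
        meta["min_id"] = str(page[-1])
    return {"data": [{"id": str(i)} for i in page], "meta": meta}


def fetch_all_since(since_id: int, count: int) -> List[int]:
    """Collect every post newer than since_id, stepping newest to oldest."""
    collected: List[int] = []
    before_id: Optional[int] = None
    while True:
        envelope = fetch_page(count, before_id=before_id, since_id=since_id)
        collected.extend(int(post["id"]) for post in envelope["data"])
        if not envelope["meta"]["more"]:
            break
        # Continue below the oldest post received so far.
        before_id = int(envelope["meta"]["min_id"])
    return collected
```

With the envelope, the client no longer has to scan the received posts to find the bounds, but it still has to walk the whole gap from the newest end.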
(Note: It only took about two minutes to change my app back to only fill gaps in from newest-to-oldest, so please don't read this as bitching. :) The painkillers are kicking in so I'm not confident in how I've managed tone here. Nothing but love.)
I love that a response envelope is coming, that's great news. Big ups and thanks! It will be a bonus to not spin through received posts to know the bounds of what was received.
Unfortunately I don't think it solves the problem when needing to fetch from oldest to newest as-is. Fetching data gradually from newest to oldest works great now (and the "more" metadata will enable us to make that experience even better).
Imagine a scenario in which a user is viewing their mentions, participating in a lively conversation until the wee hours. They go to sleep and close their app. Upon waking, or perhaps a day or two later, they go back to the conversation and want to catch up. We can go ahead and grab the latest mentions no problem. Seeing that there's a gap in the conversation, also not a problem. Good to go. But if that user wants to catch up to that stream chronologically, we have to backfill in steps from the newest unseen post rather than from the oldest as some users will expect.
You'd have to be fairly popular to have so many mentions in that timeframe for there to be more than a set or two of "newest in this gap" posts, but now imagine that situation applied to one's personal stream, or to a view of a conversation thread. There it becomes a more significant problem. They may not want to spin through dozens of pages of posts to just catch the earliest stuff they missed.
For conversations / threads it becomes a pretty common problem. Stepping into a thread where you want to read it all from the beginning is going to be a common use case; probably even more so than on Twitter. I've happily noticed that conversations here are often far more robust than at that birdy place. Embracing and encouraging that is a good thing for the network as a whole.
When posts become more than text exchanges between people (insert chess game scenario here) this "directionality" will probably be even more important.
Anyway, I know the to-do list is colossal, and I'm thoroughly impressed with how quickly and reliably the API has grown. This is probably not as important as most of the immediate things coming, but it's important to address as soon as it can be done well.
If there's some technique I'm blind to, please do let me know.
Another use case where oldest-to-newest might be useful is an app that lets me browse my history of posts, where I want to jump straight to the first post I made. How can I get that first page of posts without the app requesting in sequence every single group of 200 posts I ever made? Huge waste of API calls, bandwidth, and time.
The problem is that app would have no idea what to set the before_id in order to get the older groups of posts without requesting every page.
Perhaps a solution would be to add an API endpoint that lets the application request an index given page size C (for count). The index would give it the max_id and min_id of every page of C posts. Then the app would be free to set the before_id and since_id to request the first page or a page in the middle if the app already has posts to a certain point.
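To make the index idea concrete, here is a small Python sketch of what such an index could contain. Nothing like this exists in the API as far as this thread shows; `build_page_index` is a hypothetical helper that runs over an in-memory list of post ids purely for illustration.

```python
from typing import List, Tuple


def build_page_index(post_ids: List[int], page_size: int) -> List[Tuple[int, int]]:
    """Return (max_id, min_id) for each page of `page_size` posts, newest page first."""
    ordered = sorted(post_ids, reverse=True)
    index = []
    for start in range(0, len(ordered), page_size):
        page = ordered[start:start + page_size]
        index.append((page[0], page[-1]))
    return index
```

Given the index, an app that wants the oldest page could set before_id to the min_id of the second-to-last entry instead of requesting every page in between.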
Again respectfully, @mthurman I don't think this issue is closed.
Let's say I've scrolled way, way back in my timeline, 20 posts at a time.
I'm at "before_id" of 1500.
So, the posts I'm seeing are 1500, 1491, 1473, ... 1452, 1452, 1400.
I now want to jump one page forward and see the posts since 1500.
I don't want to jump to the very latest posts (as that's what I get by not using since_id).
I want to see 1632, 1604 .... 1543, 1501.
But, of course, I've no way of knowing that the next set of posts starts with 1632. So I can't call "before_id".
"Since", to me, says "show me what has happened since this event, starting at this event". The current implementation says "Get posts up to this event".
That's what I think needs to change.
Ok, I'll reopen this issue. I'm not promising anything will change (and it may be a while before it's revisited), but we can revisit it.
Just to echo what @mthurman said, our response envelope does provide meta pagination data. We just updated the documents so have a look at that. That is the preferred mechanism to find out which IDs are present and whether more data is available (this comment is not a reply to the most recent comments on this issue, just extra clarification regarding pagination metadata).
I just wanted to echo the sentiment of needing the ability to retrieve posts from oldest to newest. I'm struggling with this very issue right now when trying to display threads/conversations starting with the first post. So, I have to work backwards in chunks of 200 from the newest post. What happens when there's a thread with 10,000 posts in it? Terrible performance and a lot of wasted bandwidth. This is a requirement, not a luxury IMO. I would suggest just adding an extra parameter to the API to retrieve posts "from_oldest".
You can now get the oldest posts by using a negative count parameter. since_id and before_id will still work the same.
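A minimal sketch of how gap filling from oldest to newest could look with a negative count. This is a simulation over an in-memory list, not a call to the real API: it assumes a negative count means "the oldest |count| posts inside the since_id/before_id bounds", and the oldest-first order returned by the hypothetical `query` function is an assumption made here for simplicity.

```python
from typing import List, Optional


def query(post_ids: List[int], count: int,
          since_id: Optional[int] = None,
          before_id: Optional[int] = None) -> List[int]:
    """Hypothetical stand-in for an endpoint that accepts a negative count."""
    in_bounds = sorted(i for i in post_ids
                       if (since_id is None or i > since_id)
                       and (before_id is None or i < before_id))
    if count < 0:
        # Assumed behavior: oldest |count| posts, returned oldest first.
        return in_bounds[:-count]
    # Positive count: newest `count` posts, newest first.
    return in_bounds[::-1][:count]


def fill_gap(post_ids: List[int], since_id: int,
             before_id: int, page_size: int) -> List[int]:
    """Fill the gap between since_id and before_id chronologically."""
    filled: List[int] = []
    cursor = since_id
    while True:
        page = query(post_ids, -page_size, since_id=cursor, before_id=before_id)
        if not page:
            break
        filled.extend(page)
        cursor = max(page)  # next request starts after the newest post seen
    return filled
```

The final empty request could be avoided by stopping as soon as a page comes back shorter than page_size.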
Apologies if I'm missing something. The problem I'm trying to solve cleanly is basically "filling in" a gap of posts in a stream, where the gap is of a known size in terms of post ids, but not in terms of how many posts will actually be provided. (Think of the mentions or personal streams.) In this case, I'm filling in posts from oldest to newest.
If I make a request to, say, posts/stream and specify a since_id X and a count of N, the API will return the newest N posts that are later than X. What I'm looking for is the N posts after X.
Specifying a before_id in this case is possible (I know the gap in post ids) but often isn't appropriate since that gap can be quite large - and, as the posting rate increases, will continue to grow for any given amount of time between posts - and contains an indeterminate count of posts for that feed anyway.
Would it be appropriate and possible, when requesting posts with a since_id and a count, for the endpoint to return the oldest N posts after that since_id? Or am I just missing something fundamental here?
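To make the difference concrete, here is a small Python sketch contrasting the two readings of since_id plus count. Both functions are hypothetical and operate on an in-memory list of post ids; `newest_after` mirrors the behavior described above, and `oldest_after` is the behavior being requested.

```python
from typing import List


def newest_after(post_ids: List[int], since_id: int, count: int) -> List[int]:
    """Described behavior: the newest `count` posts later than since_id."""
    later = sorted((i for i in post_ids if i > since_id), reverse=True)
    return later[:count]


def oldest_after(post_ids: List[int], since_id: int, count: int) -> List[int]:
    """Requested behavior: the `count` posts immediately following since_id."""
    later = sorted(i for i in post_ids if i > since_id)
    return later[:count]


ids = [990, 1000, 1004, 1019, 1033, 1080, 1100]
print(newest_after(ids, 1000, 2))  # [1100, 1080]
print(oldest_after(ids, 1000, 2))  # [1004, 1019]
```

The two only agree when the gap holds no more than `count` posts; otherwise the first leaves the oldest unseen posts (1004, 1019, 1033 here) for later requests.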