Closed jmcarp closed 8 years ago
My gut says that we should be ok with the first option. I think when you're dealing with large lists of results, the best way to find what you're looking for is to use the filters, not to click to random pages.
But I'm doing a little investigation to see if there's any usability research that addresses this.
I would lean toward adding a unique key to sort. We already have 2 kinds of pagination, I don't think anyone has complained about that. I think that using page=
is much easier to work with.
Also, if we change all the pagination for most of our endpoints, I think that change should be versioned, because it would break most people's things.
In general, paginating by offset is a little easier to understand, but performs (potentially much) worse than paginating by last index. There are currently a few tables using offset pagination that really shouldn't be--some of the aggregates (occupation, employer) have millions of rows. But I'm not (strongly) opposed to keeping offset pagination for the smaller tables.
We should probably talk about versioning soon, since we intentionally haven't made strong promises around it and haven't started on implementation. I heard a rumor that @konklone might not think versioning is necessary--would be interested to hear more about that. That aside, I don't think this bug is the place to implement versioning.
I heard a rumor that @konklone might not think versioning is necessary--would be interested to hear more about that.
=) The relevant discussion is here: https://github.com/18F/api-standards/issues/5
However, I wasn't opposed to versioning in and of itself. I was arguing that versioning is premature optimization for v1 of an API, and that it's a decision better made at v2.
Definitely agree that it's possible to version prematurely. My main uncertainty is how we know when we get to v2--when we need to commit to version any breaking changes. If we change pagination parameters, does that push us to v2?
If I had built something, this would annoy me-- maybe I am too sensitive. I have been asking people to try the data and I think this API will be in good shape for the 2016 election.
Having API users is helpful to find bugs like these. I don't think we will have API users if we are not careful about introducing breaking changes. And again, maybe they are more tolerant than I think.
I would be ok, not versioning if we could give a warning. Could we use unique key in the sort rules for now, and then post something on our documentation about the coming change and email the google group? I know no one is on that group yet, but that would make a good precedent for people to join.
I tend to agree with @jmcarp that it is preferable to have the same style of pagination (seek / cursor) everywhere throughout the API for consistency, and those who wish to find more specific things should just be more specific about their filters, like @noahmanger said. But I do think, with @LindsayYoung, that it is important at this point to not suddenly break things for API users. It would be good to give them some warning, at least.
That said, I find myself wondering whether there's any way to achieve a hybrid model for the pagination parameters, or to simply start using cursor pagination, while warning the user that they are not getting the pagination results they may have expected if they are entering random page=
numbers.
What would the new pagination parameters look like?
Just discussed this in the meeting: let's try to get this done in this sprint if possible.
I am going to break this into two issues the symptom described in 18F/FEC#142 and unifying pagination.
As pointed out in https://github.com/18F/FEC/issues/142, some views (specifically, views paginated using offset and limit) can return duplicate results under some sort rules. This happens when we sort on a column that has ties, which, combined with offsets and limits, causes postgres to return results in non-deterministic order. That's bad, since users may see the same result more than once, and may never see other results.
Options:
For simplicity, I would lean toward using seek pagination for everything, but I'm curious to hear what others think: @arowla @noahmanger @LindsayYoung. Both options would be quick to implement, and I think we should get to this soon if possible.