Pagination for Backend API

byhow commented 3 months ago

Description

Need to figure out a way in Postgres to do bi-directional pagination (i.e. find the exact page a given item is on)

References

thematters/matters-web#4375

Story

https://github.com/thematters/developer-resource/issues/372

byhow commented 3 months ago

Motivation

Since we would like to support navigating collections in both directions when paginating, we need to implement a new set of pagination options in the backend APIs, in particular:

new collection GraphQL query parameters
- and their corresponding query input/output types
collectionService changes that cover the pagination logic for scrolling back to previous pages
connections module changes, where the query result transformation has to take the various new params into account
corresponding test suites and their mocks/pre-populated datasets

Existing Framework

The current pagination logic supports after, which provides the crucial location controls through the PageInfo object in the paginated response from GraphQL connections. This works well for fetching the next page, using fields such as hasNextPage and endCursor.
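To make the forward-only flow concrete, here is a minimal sketch of after/first pagination over an in-memory array. The `Connection` and `PageInfo` shapes, the `forwardPage` function, and the readable `offset:<n>` cursor format are all assumptions for illustration; the real helpers in matters-server may encode cursors and build connections differently.

```typescript
// Hypothetical shapes, loosely following the GraphQL connection convention.
interface PageInfo {
  startCursor: string | null
  endCursor: string | null
  hasNextPage: boolean
  hasPreviousPage: boolean
}

interface Connection<T> {
  edges: Array<{ cursor: string; node: T }>
  pageInfo: PageInfo
}

// Assumed cursor encoding: a readable string standing in for an opaque cursor.
const indexToCursor = (index: number): string => `offset:${index}`
const cursorToIndex = (cursor: string): number => Number(cursor.split(':')[1])

// `after` is the cursor of the last item already seen; `first` is the limit.
function forwardPage<T>(
  items: T[],
  args: { after?: string; first?: number }
): Connection<T> {
  const first = args.first ?? 10
  const start = args.after === undefined ? 0 : cursorToIndex(args.after) + 1
  const slice = items.slice(start, start + first)
  return {
    // Cursors carry the absolute position, not the position inside `slice`.
    edges: slice.map((node, i) => ({ cursor: indexToCursor(start + i), node })),
    pageInfo: {
      startCursor: slice.length > 0 ? indexToCursor(start) : null,
      endCursor: slice.length > 0 ? indexToCursor(start + slice.length - 1) : null,
      hasNextPage: start + slice.length < items.length,
      hasPreviousPage: start > 0,
    },
  }
}
```

The client only needs `endCursor` and `hasNextPage` from one response to request the next page, which is why this direction works without any extra parameters.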

The new design is based on several calculations over after (which is the offset), first (which is the limit), and the NEW before, which is also an offset but points toward the previous pages. Ideally the logic would work like this. Say we have this list of articles in the collection: [1, 2, 3, 4]

Querying with { articleId: 2 } will result in [4, 3, 2, 1] with the default limit of 10
Querying with { articleId: 3, first: 2 } will result in [4, 3]
Querying with { articleId: 4, first: 2 } will still result in [4, 3]. This is because the articleId serves as an anchor for where the page is; once we have this page, we know which list of article titles to render in the frontend, and we can find() the index of that article fairly easily in the frontend as well
Querying with { before: indexToCursor(articleId2) } gives you [4, 3]
Querying with { before: indexToCursor(articleId1), first: 1 } gives you [2]
Querying with { before: indexToCursor(articleId2), first: 10 } gives you [4, 3]
Querying with { before: indexToCursor(articleId4), first: 10 } gives you []
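The cases above can be sketched as two small pure functions over the collection in display order ([4, 3, 2, 1]). This is only an illustration under assumptions: `pageContaining`, `pageBefore`, `cursorOf`, and the `offset:<n>` cursor format are hypothetical names and encodings, and the sketch works on an in-memory array rather than on a Postgres query.

```typescript
const DEFAULT_FIRST = 10

// Assumed cursor encoding: absolute position in the ordered collection.
const indexToCursor = (index: number): string => `offset:${index}`
const cursorToIndex = (cursor: string): number => Number(cursor.split(':')[1])

// Hypothetical helper: cursor of an article by its position in the ordered list.
const cursorOf = (ids: number[], articleId: number): string =>
  indexToCursor(ids.indexOf(articleId))

// Page that contains the anchor article, with pages being
// fixed-size windows counted from the top of the list.
function pageContaining(
  ids: number[],
  articleId: number,
  first: number = DEFAULT_FIRST
): number[] {
  const index = ids.indexOf(articleId)
  if (index === -1) return []
  const start = Math.floor(index / first) * first
  return ids.slice(start, start + first)
}

// Up to `first` items immediately preceding the `before` cursor.
function pageBefore(
  ids: number[],
  before: string,
  first: number = DEFAULT_FIRST
): number[] {
  const end = cursorToIndex(before)
  if (!(end > 0)) return [] // nothing precedes the first item (or bad cursor)
  return ids.slice(Math.max(0, end - first), end)
}

const ordered = [4, 3, 2, 1] // display order of the collection [1, 2, 3, 4]
console.log(pageContaining(ordered, 3, 2))
console.log(pageBefore(ordered, cursorOf(ordered, 1), 1))
```

Keeping the anchor lookup and the before window as separate functions makes each listed case a one-line assertion in a test suite.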

Challenges

With a relatively comprehensive list of cases laid out, it is not hard to see that this return logic is based on very heavy heuristics. The design was expected to be exhaustive in catching those abnormal cases, and this is where the challenge comes in.

There are two parts to it: querying with articleId and/or before. The past logic works with after and nothing else. Everything goes through the same querying interface, so it would be easier to extend the previous design, but here's the catch:

The previous logic for determining pagination and start/end was very procedural, with few type annotations. The mutable flow of take and skip has to handle a lot of hidden cases that might need further investigation / refactoring to separate each case, so that changing one does not introduce breaking changes to the previous APIs
ditto for connectionFromArray and fromConnectionArgs, both of which are pretty imperative
connection window: finding a useful way to determine startCursor and endCursor so that requests using before are valid and not error-prone
writing exhaustive tests (query, service) to catch new cases from the new params

Lastly, and probably most important of all: after investigating how startCursor is determined (it is 0-indexed), using the first item in a returned list of documents as the startCursor won't work if we want to query previous pages. See this example:

If we query { articleId: 1, first: 1 }, the result array (or edges) will be [1]. Great! Now we have this to represent where startCursor and endCursor should be, right? Not quite what I expected: both point to 0, because this is the only, [0]th, element in the list. We therefore cannot query anything before zero, since the cursor is always determined by the item's index in the RETURNED collection/array instead of the actual cursor from the database.
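The difference can be shown with a small sketch that computes the cursor window two ways for the same one-item page. The function names and the `offset:<n>` cursor format are hypothetical, and carrying the absolute offset into the cursor is just one possible way around the problem, not necessarily what the linked PR does.

```typescript
interface CursorWindow {
  startCursor: string | null
  endCursor: string | null
}

// Assumed cursor encoding, for illustration only.
const indexToCursor = (index: number): string => `offset:${index}`

// Cursors derived from positions inside the RETURNED page (the problem case).
function sliceRelativeWindow(page: number[]): CursorWindow {
  if (page.length === 0) return { startCursor: null, endCursor: null }
  return {
    startCursor: indexToCursor(0),
    endCursor: indexToCursor(page.length - 1),
  }
}

// Cursors derived from the page's absolute offset in the full ordered collection.
function absoluteWindow(page: number[], offset: number): CursorWindow {
  if (page.length === 0) return { startCursor: null, endCursor: null }
  return {
    startCursor: indexToCursor(offset),
    endCursor: indexToCursor(offset + page.length - 1),
  }
}

const ordered = [4, 3, 2, 1]                   // full collection in display order
const offset = ordered.indexOf(1)              // article 1 sits at position 3
const page = ordered.slice(offset, offset + 1) // { articleId: 1, first: 1 }

const relative = sliceRelativeWindow(page)     // both cursors point at 0
const absolute = absoluteWindow(page, offset)  // both cursors point at 3
console.log(relative, absolute)
```

With the slice-relative window, a follow-up before query has nothing in front of position 0 to return; with the absolute window, the same query can reach articles 4, 3 and 2.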

Remaining Work

The overall framework is ready. However, to make the existing logic more resilient to future pagination changes, it might be best to add a new function alongside fromConnectionArgs or connectionFromArray to handle the before logic, so that the previous pagination flow with after stays intact.

Tuning the types and tests here can also be pretty heavy work, since there is a lot of mocking, pre-populating datasets, and testing arrays of different lengths with different combinations of the parameters (2 new params and the previous ~5 critical params)

byhow commented 3 months ago

cross-referencing https://github.com/thematters/matters-server/pull/3994
