Let {edges} be the result of calling {ApplyCursorsToEdges(allEdges, before, after)}.
If {first} is set:
If {first} is less than 0:
Throw an error.
If {edges} has length greater than than {first}:
Slice {edges} to be of length {first} by removing edges from the end of {edges}.
If {last} is set:
If {last} is less than 0:
Throw an error.
If {edges} has length greater than than {last}:
Slice {edges} to be of length {last} by removing edges from the start of {edges}.
Return {edges}.
ApplyCursorsToEdges(allEdges, before, after):
Initialize {edges} to be {allEdges}.
If {after} is set:
Let {afterEdge} be the edge in {edges} whose {cursor} is equal to the {after} argument.
If {afterEdge} exists:
Remove all elements of {edges} before and including {afterEdge}.
If {before} is set:
Let {beforeEdge} be the edge in {edges} whose {cursor} is equal to the {before} argument.
If {beforeEdge} exists:
Remove all elements of {edges} after and including {beforeEdge}.
Return {edges}.
I feel it completely defeats the performance and safety purpose of exposing a connection over a regular GraphQL list. My main worry is it takes more effort (and knowledge) to construct a query that does the right thing (ie setting first/ last) than the wrong thing (ie not passing any argument). It feels this behavior is at odds with Facebook usual design philosophy of having developers "fall in the pit of success". In this case, the most natural / easy thing to do for clients is to request a potentially giant list of nodes at once.
Since most server-side libraries that implement the connection spec try to deliberately stick to the specs, it means that most Relay-compliant servers expose this dangerous and confusing behavior without being aware of it. As a result at my work we had to fork Graphene, the main GraphQL implementation in Python, to prevent GraphQL new comers to bypass pagination by accident.
Similarly, major real-world servers implementing the Relay spec had to purposefully break this spec in order to prevent users from issuing very expensive requests. For example, Github's API returns a MISSING_PAGINATION_BOUNDARIES with the the message "You must provide a 'first' or 'last' value to properly paginate the '<connection_name>' connection."
It's not the role of the Relay spec to specify how servers should protect themselves from expensive / abusive queries but we should not have to break the Relay spec to do so. My hunch is that this behavior was an oversight because the spec never explicitly mentions it. For example, the hasPreviousPage and hasNextPagealgorithms imply that there are only 2 types of queries: the ones "paginating with last / before" and the ones paginating with first / after.
Assuming it's indeed not deliberate design decision, I see 2 possible ways to enhance the pagination algorithm spec:
A. If {first} is not set and {last} is not set, return an empty list.
While it makes sense from an API design perspective (ie "return 0 edge when none are requested"), it has a few drawbacks:
it makes the spec more complex because we also have to specify the behavior of the PageInfo fields in that case.
it makes the API less self-discoverable in the sense that it is not obvious to a new comer why a connection did not return any edges
B. If {first} is not set and {last} is not set, throw an error.
That's my favorite approach because it addresses all the issues of A. and it is a very straightforward change to implement in servers and libraries. And my guess is that it's the behavior a lot of real-world Relay-compliant servers already follow (eg Github API) .
I understand that changing the spec should not be taken lightly and that's in everyone's interest that the specs remain stable as much as possible. That said, I think it's worth making that minor change because:
it will not impact the Relay ecosystem stability since Relay JS never relied on it
it will go a long way in helping the community builds servers / queries that are performant and safe by default
If you are reluctant to making change B. (or A.), can we at least include a note that says that the pagination behavior is not well specified when neither first nor last is set? This would be similar the note about when both first and last are included and it would make it easier for libraries and servers to justify deviating lightly from this implicit behavior.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi everyone,
I just realized the pagination algorithm spec implicitly says that if neither
first
norlast
is set, it should return ALL the edges in the connection:I feel it completely defeats the performance and safety purpose of exposing a connection over a regular GraphQL list. My main worry is it takes more effort (and knowledge) to construct a query that does the right thing (ie setting
first
/last
) than the wrong thing (ie not passing any argument). It feels this behavior is at odds with Facebook usual design philosophy of having developers "fall in the pit of success". In this case, the most natural / easy thing to do for clients is to request a potentially giant list of nodes at once.Since most server-side libraries that implement the connection spec try to deliberately stick to the specs, it means that most Relay-compliant servers expose this dangerous and confusing behavior without being aware of it. As a result at my work we had to fork Graphene, the main GraphQL implementation in Python, to prevent GraphQL new comers to bypass pagination by accident.
Similarly, major real-world servers implementing the Relay spec had to purposefully break this spec in order to prevent users from issuing very expensive requests. For example, Github's API returns a
MISSING_PAGINATION_BOUNDARIES
with the the message"You must provide a 'first' or 'last' value to properly paginate the '<connection_name>' connection."
It's not the role of the Relay spec to specify how servers should protect themselves from expensive / abusive queries but we should not have to break the Relay spec to do so. My hunch is that this behavior was an oversight because the spec never explicitly mentions it. For example, the
hasPreviousPage
andhasNextPage
algorithms imply that there are only 2 types of queries: the ones "paginating withlast
/before
" and the ones paginating withfirst
/after
.Assuming it's indeed not deliberate design decision, I see 2 possible ways to enhance the pagination algorithm spec:
A. If {first} is not set and {last} is not set, return an empty list.
While it makes sense from an API design perspective (ie "return 0 edge when none are requested"), it has a few drawbacks:
PageInfo
fields in that case.B. If {first} is not set and {last} is not set, throw an error.
That's my favorite approach because it addresses all the issues of A. and it is a very straightforward change to implement in servers and libraries. And my guess is that it's the behavior a lot of real-world Relay-compliant servers already follow (eg Github API) .
I understand that changing the spec should not be taken lightly and that's in everyone's interest that the specs remain stable as much as possible. That said, I think it's worth making that minor change because:
If you are reluctant to making change B. (or A.), can we at least include a note that says that the pagination behavior is not well specified when neither
first
norlast
is set? This would be similar the note about when bothfirst
andlast
are included and it would make it easier for libraries and servers to justify deviating lightly from this implicit behavior.I would love to hear your thoughts. Cheers