facebook / relay

Relay is a JavaScript framework for building data-driven React applications.
https://relay.dev
MIT License
18.41k stars 1.83k forks source link

Spec enhancement: Clarify pagination algorithm when neither `first` nor `last` is set #2944

Open jnak opened 4 years ago

jnak commented 4 years ago

Hi everyone,

I just realized the pagination algorithm spec implicitly says that if neither first nor last is set, it should return ALL the edges in the connection:

EdgesToReturn(allEdges, before, after, first, last):

  • Let {edges} be the result of calling {ApplyCursorsToEdges(allEdges, before, after)}.
  • If {first} is set:
    • If {first} is less than 0:
      • Throw an error.
    • If {edges} has length greater than than {first}:
      • Slice {edges} to be of length {first} by removing edges from the end of {edges}.
  • If {last} is set:
    • If {last} is less than 0:
      • Throw an error.
    • If {edges} has length greater than than {last}:
      • Slice {edges} to be of length {last} by removing edges from the start of {edges}.
  • Return {edges}.

ApplyCursorsToEdges(allEdges, before, after):

  • Initialize {edges} to be {allEdges}.
  • If {after} is set:
    • Let {afterEdge} be the edge in {edges} whose {cursor} is equal to the {after} argument.
    • If {afterEdge} exists:
      • Remove all elements of {edges} before and including {afterEdge}.
  • If {before} is set:
    • Let {beforeEdge} be the edge in {edges} whose {cursor} is equal to the {before} argument.
    • If {beforeEdge} exists:
      • Remove all elements of {edges} after and including {beforeEdge}.
  • Return {edges}.

I feel it completely defeats the performance and safety purpose of exposing a connection over a regular GraphQL list. My main worry is it takes more effort (and knowledge) to construct a query that does the right thing (ie setting first/ last) than the wrong thing (ie not passing any argument). It feels this behavior is at odds with Facebook usual design philosophy of having developers "fall in the pit of success". In this case, the most natural / easy thing to do for clients is to request a potentially giant list of nodes at once.

Since most server-side libraries that implement the connection spec try to deliberately stick to the specs, it means that most Relay-compliant servers expose this dangerous and confusing behavior without being aware of it. As a result at my work we had to fork Graphene, the main GraphQL implementation in Python, to prevent GraphQL new comers to bypass pagination by accident.

Similarly, major real-world servers implementing the Relay spec had to purposefully break this spec in order to prevent users from issuing very expensive requests. For example, Github's API returns a MISSING_PAGINATION_BOUNDARIES with the the message "You must provide a 'first' or 'last' value to properly paginate the '<connection_name>' connection."

It's not the role of the Relay spec to specify how servers should protect themselves from expensive / abusive queries but we should not have to break the Relay spec to do so. My hunch is that this behavior was an oversight because the spec never explicitly mentions it. For example, the hasPreviousPage and hasNextPage algorithms imply that there are only 2 types of queries: the ones "paginating with last / before" and the ones paginating with first / after.

Assuming it's indeed not deliberate design decision, I see 2 possible ways to enhance the pagination algorithm spec:

A. If {first} is not set and {last} is not set, return an empty list.

While it makes sense from an API design perspective (ie "return 0 edge when none are requested"), it has a few drawbacks:

B. If {first} is not set and {last} is not set, throw an error.

That's my favorite approach because it addresses all the issues of A. and it is a very straightforward change to implement in servers and libraries. And my guess is that it's the behavior a lot of real-world Relay-compliant servers already follow (eg Github API) .

I understand that changing the spec should not be taken lightly and that's in everyone's interest that the specs remain stable as much as possible. That said, I think it's worth making that minor change because:

If you are reluctant to making change B. (or A.), can we at least include a note that says that the pagination behavior is not well specified when neither first nor last is set? This would be similar the note about when both first and last are included and it would make it easier for libraries and servers to justify deviating lightly from this implicit behavior.

I would love to hear your thoughts. Cheers

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.