Pagination of related resources

emetsger commented 7 years ago

Let's say you had a simple model of a Book: made up of Chapters and Pages. An example chapter might look like the following, with a relationship to its pages:

{
  "data": {
    "type": "chapter",
    "id": "lotr_rotk_chapter_1",
    "attributes": {
      "number": 1,
      "title": "Minas Tirith",
      "summary": "Gandalf and Pippin arrive in Minas Tirith; they talk with Denethor; Pippin enters the service of the steward."
    },
    "relationships": {
      "pages": {
        "links": {
          "related": "http://example.org/lotr_rotk/chapter_1/pages/"
        }
      }
    }
  }
}

In your Chapter Java class you might have something like:

@Type("chapter")
public class Chapter {

  @Relationship("pages")
  private List<Page> pages;

  // accessors
}

You retrieve an instance of Chapter using the jsonapi-converter.

The question is, what would chapter.getPages().size() return if the server paginated the response, and only returned 10 page objects at a time:

{
  "data": [
    {
      "type": "page",
      "id": "lotr_rotk_page_1",
      "attributes": {
        "number": 1,
        "text": "Pippin looked out from the shelter of Gandalf's cloak. He wondered if he was awake or still sleeping..."
      }
    },
// more pages ...
    {
      "type": "page",
      "id": "lotr_rotk_page_10",
      "attributes": {
        "number": 10,
        "text": "Gandalf passed now into the wide land beyond the Rammas Echor..."
      }
    }
  ],
  "links": {
    "first": "http://example.org/lotr_rotk/chapter_1/pages/",
    "last": "http://example.org/lotr_rotk/chapter_1/pages/?page=10",
    "next": "http://example.org/lotr_rotk/chapter_1/pages/?page=2",
    "prev": null,
    "self": "http://example.org/lotr_rotk/chapter_1/pages/"
  },
  "meta": {
    "total": 100,
    "per_page": 10
  }
}

My understanding is that RelationshipResolver#resolve("http://example.org/lotr_rotk/chapter_1/pages/") would be invoked by the ResourceConverter when parsing the chapter JSON, and it would only get the first page of results.

Do you have any thoughts on how retrieving relationships that may be paginated might work?

emetsger commented 7 years ago

Just thinking out loud...

It's not really feasible to implement a custom RelationshipResolver because it simply returns a byte array. Even if you were to deserialize the byte array into an object and recursively retrieve all the pages, you'd have to reduce all of the pages into a single byte array which seems sub-optimal.

Another place to handle this might be in ResourceConverter#handleRelationships(JsonNode source, Object object), starting around line 458, where there's an opportunity to look at the JSONAPIDocument returned by readDocumentCollection(...): you might recurse and retrieve all pages, but then you'd have to reduce all the results into a single JSONAPIDocument.

Or you may say this isn't the responsibility of the jsonapi-converter at all; it is the responsibility of the calling application to handle paginated results. I could appreciate that sentiment, but it doesn't look like the jsonapi-converter provides a means for retrieving paginated results. I would need to store the pages relationship as a String (instead of List<Page>) and retrieve the relationship on my own.

emetsger commented 7 years ago

Hm, maybe ResourceConverter#handleRelationships(...) is the right place: https://gist.github.com/emetsger/7840d1ec13223f19e1fa4d9bc43fe19f

jasminb commented 7 years ago

Hey @emetsger,

All your points are valid; the current state of the lib does not allow for paginated relationship handling.

The lib should allow both options: eager fetching (your solution) and a way to get paging info so that users can paginate themselves.

Ideally, though, the lib could proxy the relationship object and abstract away the pagination. A proxy would allow for iteration, but it has its problems, e.g. size() would require fetching everything.

A step in the right direction would be to allow eager fetching, so that users can use the lib in this case. However, I would prefer to have the user decide whether eager fetching is used (there could be large paginated results that could crash clients).
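To make the eager option concrete, here is a minimal sketch that follows next links until none remains and reduces the pages into one list, with a caller-supplied cap so that a very large collection fails fast instead of exhausting memory. PageOfResults, EagerFetcher, fetchAll and the maxPages parameter are hypothetical names for illustration, not part of jsonapi-converter; fetching and parsing a single page is hidden behind a Function so the sketch does not depend on the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** One already-parsed page of a paginated collection (hypothetical type). */
final class PageOfResults<T> {
  final List<T> items;
  final String next; // value of links.next, or null on the last page

  PageOfResults(List<T> items, String next) {
    this.items = items;
    this.next = next;
  }
}

final class EagerFetcher {
  /**
   * Follows next links starting at firstUrl and collects every item.
   * Throws IllegalStateException if more than maxPages pages would be needed.
   */
  static <T> List<T> fetchAll(
      String firstUrl, Function<String, PageOfResults<T>> fetchPage, int maxPages) {
    List<T> all = new ArrayList<>();
    String url = firstUrl;
    int fetched = 0;
    while (url != null) {
      if (fetched++ >= maxPages) {
        throw new IllegalStateException(
            "more than " + maxPages + " pages; refusing to fetch eagerly");
      }
      PageOfResults<T> page = fetchPage.apply(url);
      all.addAll(page.items);
      url = page.next; // null ends the loop
    }
    return all;
  }
}
```

The cap is one way to leave the decision with the user: a client that knows its collections are small can pass a generous limit, while others can keep it low or avoid eager fetching altogether.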

If you feel like making a PR, that would be awesome, or we can discuss further to come up with a better solution.

emetsger commented 7 years ago

Hi @jasminb,

Yes, I initially used an eager approach in my fork, but in the end I implemented a lazy approach. I don't think we will be able to use my code as-is for a number of reasons†, but perhaps the concepts could be adapted into a suitable solution for inclusion into your library.

I think there are a couple of things to consider, as you point out above:

there would need to be some abstraction or pluggable strategy for parsing pagination information, including the total size of the collection, the number of results per page, and the offset into the results
the client should be able to select eager vs. lazy fetching
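The two points above could be sketched roughly as follows. Everything here (PaginationInfo, PaginationParser, FetchMode, MetaTotalPerPageParser) is a hypothetical illustration rather than an existing jsonapi-converter API. The sample parser assumes the convention from the example response earlier in this thread: total and per_page in meta, and next in links. An offset is left out because that example response does not carry one.

```java
import java.util.Map;

/** Pagination facts extracted from one response; null means "not provided". */
final class PaginationInfo {
  final Integer total;   // total number of results in the collection
  final Integer perPage; // maximum number of results per page
  final String next;     // link to the next page, or null on the last page

  PaginationInfo(Integer total, Integer perPage, String next) {
    this.total = total;
    this.perPage = perPage;
    this.next = next;
  }

  /** Number of pages, or -1 when total or perPage is unknown. */
  int pageCount() {
    if (total == null || perPage == null || perPage <= 0) {
      return -1;
    }
    return (total + perPage - 1) / perPage; // ceiling division
  }
}

/** Pluggable strategy: one implementation per server convention. */
interface PaginationParser {
  PaginationInfo parse(Map<String, Object> links, Map<String, Object> meta);
}

/** Lets the client choose how a paginated relationship is populated. */
enum FetchMode { EAGER, LAZY }

/** Reads "total" and "per_page" from meta and "next" from links. */
final class MetaTotalPerPageParser implements PaginationParser {
  @Override
  public PaginationInfo parse(Map<String, Object> links, Map<String, Object> meta) {
    Object next = links == null ? null : links.get("next");
    return new PaginationInfo(
        intOrNull(meta, "total"),
        intOrNull(meta, "per_page"),
        next instanceof String ? (String) next : null);
  }

  private static Integer intOrNull(Map<String, Object> map, String key) {
    Object value = map == null ? null : map.get(key);
    return value instanceof Number ? Integer.valueOf(((Number) value).intValue()) : null;
  }
}
```

A server that reports pagination differently (for example in headers, or under other key names) would get its own PaginationParser, and the rest of the fetching code would only ever see PaginationInfo.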

In my naive implementation, I wrapped ResourceList with a pagination-aware List implementation that uses a pagination-aware Iterator for obtaining subsequent pages.

A couple of comments/observations about the implementation:

1) I'm working with a JSONAPI implementation that provides two fields in the meta section of a top-level links object: a total field with the total number of results in the collection, and a per_page field, which tells you the maximum number of results retrieved per page. (Note that the spec offers some recommendations on pagination metadata.)
2) Sequential forward traversal of results is possible, even without knowing the total size of the collection (see testPaginationWithNoTotalOrPerPage()). As long as a next link is present, subsequent requests can be made for the next page.
3) In order to traverse the results forward or backward, a page would have to provide an offset into the results.
4) Similarly, an offset would aid a proper implementation of List when a specific page of results is desired (e.g. directly retrieving page 3 of a collection vs. retrieving page 1 and traversing to it).

In summary, the functionality of a lazy-fetch implementation will depend on what pagination-related metadata the response carries.
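As an illustration of forward-only traversal that needs nothing but a next link, here is a minimal sketch of a lazy iterator. It is not the code from my fork; Slice and LazyPageIterator are hypothetical names, and fetching and parsing a page is again hidden behind a Function. A page is requested only when the current one is exhausted.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** One parsed page of results plus its next link (hypothetical type). */
final class Slice<T> {
  final List<T> items;
  final String next; // null when the response has no next link

  Slice(List<T> items, String next) {
    this.items = items;
    this.next = next;
  }
}

/** Iterates over a paginated collection, fetching pages on demand. */
final class LazyPageIterator<T> implements Iterator<T> {
  private final Function<String, Slice<T>> fetchPage;
  private Iterator<T> current = Collections.emptyIterator();
  private String nextUrl;

  LazyPageIterator(String firstUrl, Function<String, Slice<T>> fetchPage) {
    this.nextUrl = firstUrl;
    this.fetchPage = fetchPage;
  }

  @Override
  public boolean hasNext() {
    // Fetch the next page only when the current one is used up;
    // the loop also skips over any empty pages.
    while (!current.hasNext() && nextUrl != null) {
      Slice<T> slice = fetchPage.apply(nextUrl);
      current = slice.items.iterator();
      nextUrl = slice.next;
    }
    return current.hasNext();
  }

  @Override
  public T next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return current.next();
  }
}
```

A List wrapper built on such an iterator could answer size() from a total field when the server provides one; without it, size() would have to walk every page, which is the drawback of the proxy approach mentioned earlier in the thread.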

† because 1) it uses Java 8, 2) it lacks abstractions for parsing pagination metadata from the response, 3) the PaginatedResourceList implementation makes some simplifying assumptions, 4) PaginatedResourceList gets a reference to ResourceConverter (I'm using an older version of the library), which may be sub-optimal; I'm not sure of the thread-safety of the class, but it seemed OK to share.

jasminb / jsonapi-converter

Pagination of related resources #104