support pagination in the /catalog API

jmrodri commented 7 years ago

Proposal

The catalog API should support pagination in the cases where there are large numbers of services supported by a broker. We see a potential for 10k-15k catalog entries which could take a bit of time to render to the caller.

Effectively, the caller of the initial catalog request should expect either the Link header or metadata in the body indicating that subsequent calls to the broker should use pagination.

The API should take the following optional parameters:

parameter name	description
per_page	number of items to show per page
page	page number requested, depends on page size. if > last page, then last page is returned
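A minimal sketch of how a broker might interpret these two parameters, including the "if > last page, then last page is returned" rule. The function names `resolve_page` and `page_slice`, the default of 20 items per page, and the clamping of page numbers below 1 are assumptions made for illustration; the proposal does not define them.

```python
import math


def resolve_page(total_items: int, page: int = 1, per_page: int = 20) -> tuple[int, int]:
    """Return (effective_page, last_page) for a catalog of total_items entries."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    # An empty catalog is treated as having a single (empty) page.
    last_page = max(1, math.ceil(total_items / per_page))
    # Per the proposal, a page beyond the last page returns the last page.
    # Clamping values below 1 up to page 1 is an assumption of this sketch.
    effective_page = min(max(page, 1), last_page)
    return effective_page, last_page


def page_slice(items: list, page: int = 1, per_page: int = 20) -> list:
    """Return the items that belong on the requested page."""
    effective_page, _ = resolve_page(len(items), page, per_page)
    start = (effective_page - 1) * per_page
    return items[start:start + per_page]
```

For example, with 45 entries and `per_page=20`, a request for page 99 would be served as page 3, which holds the final 5 entries.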

cURL

   $ curl -H "X-Broker-API-Version: 2.9" "http://username:password@broker-url/v2/catalog?page=2&per_page=20"

Body (changes in bold)

Response field	Type	Description
services*	array-of-service-objects	Schema of service objects defined below.
pagination	metadata about pagination	Schema for pagination defined below.

Response

The pagination block would be added to the catalog response at the same level as services. In the case of requesting the first page, the response will contain a next and a last item.

{
  "services": [{
     ...
     }],
  "pagination": {
    "next": "https://broker/v2/catalog?page=2",
    "last": "https://broker/v2/catalog?page=15333"
  }
}

Alternatively, we could use the Link header (RFC 5988). In this case there would be NO changes to the response body, just the headers.

Link: <https://broker/v2/catalog?page=2>; rel="next",
      <https://broker/v2/catalog?page=15333>; rel="last"

Subsequent calls will yield 2 new options, first and prev. For example, if you request page 10000 the following response will be returned:

{
  "services": [{
     ...
     }],
  "pagination": {
    "next": "https://broker/v2/catalog?page=10001",
    "last": "https://broker/v2/catalog?page=15333",
    "first": "https://broker/v2/catalog?page=1",
    "prev":  "https://broker/v2/catalog?page=9999"
  }
}
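The two response bodies above suggest a simple rule for which links appear. The following is a sketch of that rule only, not the proposal's definition: `pagination_links` is a hypothetical helper, and the behaviour on the last page (omit `next`, keep `last`) is an assumption, since the proposal only shows the first page and a middle page.

```python
from urllib.parse import urlencode


def pagination_links(base_url: str, page: int, last_page: int) -> dict[str, str]:
    """Build the "pagination" block for the given page of the catalog."""

    def url(n: int) -> str:
        return f"{base_url}?{urlencode({'page': n})}"

    links: dict[str, str] = {}
    if page < last_page:
        links["next"] = url(page + 1)
    # Assumption: "last" is always present, even on the last page itself.
    links["last"] = url(last_page)
    if page > 1:
        # "first" and "prev" only appear once the client is past page 1.
        links["first"] = url(1)
        links["prev"] = url(page - 1)
    return links
```

Calling `pagination_links("https://broker/v2/catalog", 1, 15333)` yields only `next` and `last`, while page 10000 yields all four links, matching the examples in this proposal.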

Alternatively, we could use the Link header (RFC 5988):

Link: <https://broker/v2/catalog?page=10001>; rel="next",
      <https://broker/v2/catalog?page=15333>; rel="last",
      <https://broker/v2/catalog?page=1>; rel="first",
      <https://broker/v2/catalog?page=9999>; rel="prev"
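For the header variant, a platform would need to turn the Link value back into a map of relation to URL. The sketch below handles only the simple `<url>; rel="name"` form shown in this proposal, not the full RFC 5988 grammar; `format_link_header` and `parse_link_header` are hypothetical names.

```python
import re

# Matches one `<url>; rel="name"` entry; other link parameters are not handled.
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def format_link_header(links: dict[str, str]) -> str:
    """Serialize a {rel: url} mapping into a Link header value."""
    return ", ".join(f'<{url}>; rel="{rel}"' for rel, url in links.items())


def parse_link_header(value: str) -> dict[str, str]:
    """Parse a Link header value into a {rel: url} mapping."""
    return {rel: url for url, rel in _LINK_RE.findall(value)}
```

A client could then follow `parse_link_header(value).get("next")` until no `next` relation is present.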

jmrodri commented 7 years ago

@pmorie here is the first cut at a proposal for supporting pagination in the catalog api.

angarg12 commented 7 years ago

Hello @jmrodri. Thanks for your proposal.

According to #122 proposals should be written in a Google Doc and linked from an issue. The reason is that docs make collaborating, writing notes and proposing changes to the spec easier. That is how previous proposals have been handled.

Also notice that spec changes should be validated through implementation once the design has been agreed upon (step 3).

jmrodri commented 7 years ago

@angarg12 I will write up the Google doc now and share the link in this issue. I apologize for not reading #122 first.

angarg12 commented 7 years ago

No apologies needed! Thanks for getting involved and contributing to the project :tada:.

jmrodri commented 7 years ago

@angarg12 @pmorie here is the google doc containing the proposal. let me know if I need to change anything like permissions etc.

https://docs.google.com/document/d/1P0MTtz5k0BZhi6CxXN9fsqmXBYojFKHxUrspndb6ODM/edit?usp=sharing

angarg12 commented 7 years ago

It works correctly, thanks.

jmrodri commented 7 years ago

What would be the next steps with this issue? Do we need a prototype?

angarg12 commented 7 years ago

You can see the latest contributing procedure here:

https://github.com/duglin/servicebroker/blob/9dd1d36ab5525fe6bb8f7175588cc567d41b5b0d/CONTRIBUTING.md

If your proposal is finished, then we should move to the validation through implementation.

RamakrishnanArun commented 7 years ago

One additional note, not directly related to pagination: if there are potentially 10k+ entries in the catalog, one might need a way to have the broker provide search capabilities as well. I bring it up here because if there are 100s of pages, and only catalog entries on the current page are available to a client platform at a time, it would become difficult to search for specific entries. An example could be a way to search in tags for "caching" or "nosql" to see what options are on offer. That response could of course also include pagination.

angarg12 commented 7 years ago

@RamakrishnanArun this was brought up in our weekly meeting briefly. I suggest you open a new issue to discuss that topic specifically, so that everyone can participate.

RamakrishnanArun commented 7 years ago

Added an issue to discuss search on the catalog api in issues #136.

bmelville commented 7 years ago

Hey folks, this is starting to come up for some of our use-cases, and I'm wondering if any movement has been made on it since February.

I'd also like to suggest a more generic approach to pagination where, instead of a field for page number, the pagination scheme uses opaque tokens. The service response returns a next_token, and the client passes this token when making further requests.

Brokers can still just use page numbers as the token if they want, or use more advanced tokens like item IDs to better capture where in the pagination the client was.

avade commented 7 years ago

@bmelville " this is starting to come up for some of our use-cases" can you share more details of the pain it's causing and the use case?

We can then prioritise focusing on this issue.

jmrodri commented 7 years ago

@bmelville I'd be curious about the opaque token. Admittedly I don't see how that would work, not saying it won't just don't understand it 😄

bmelville commented 7 years ago

@avade, we have brokers with potentially hundreds or thousands of service definitions exposed through them. We find that even >10 can be a lot of information to pass, especially with schemas part of the definition now.

bmelville commented 7 years ago

@jmrodri, the problem with using just page numbers all the time is that the data behind the scenes can change between requests. So getting page 1 and then asking for page 2 could result in duplicate entries if something that was on page 1 before now moves to page 2 (something was inserted before it).

Using an opaque token could be as easy as the identifier for the next row in the database, such that the next request I make always starts with the next thing I am guaranteed not to have seen on the previous requests. This ensures I'm not getting duplicate entries.

If that doesn't concern you (e.g., your catalog is static), you could always just return the page number as the opaque token, but it allows a broker who needs to guard against it to do more advanced pagination tokens.

A specific use-case we have is a broker with a dynamic catalog, i.e., an administrator or producer can create and manage service definitions dynamically through a CRUD API.
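The opaque-token idea described above could be sketched roughly as follows. Everything here is illustrative: the token is assumed to be the base64-encoded `id` of the last service returned, services are assumed to be ordered by `id`, and `list_page`, `encode_token` and `decode_token` are hypothetical names. Only the `next_token` field name comes from this thread.

```python
import base64
from typing import Optional


def encode_token(last_id: str) -> str:
    """Wrap the last-seen id so that clients treat it as opaque."""
    return base64.urlsafe_b64encode(last_id.encode()).decode()


def decode_token(token: str) -> str:
    return base64.urlsafe_b64decode(token.encode()).decode()


def list_page(services: list[dict], token: Optional[str], per_page: int) -> dict:
    """Return one page of services strictly after the position the token encodes."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    ordered = sorted(services, key=lambda s: s["id"])
    if token is not None:
        last_id = decode_token(token)
        # Resume after the last id the client saw, regardless of what was
        # inserted or removed before it since the previous request.
        ordered = [s for s in ordered if s["id"] > last_id]
    page = ordered[:per_page]
    body: dict = {"services": page}
    if len(ordered) > per_page:
        body["next_token"] = encode_token(page[-1]["id"])
    return body
```

In this sketch, a service inserted ahead of the client's position between two requests is skipped rather than causing an already-seen entry to be returned again. A broker with a static catalog could just as well encode a page number in the token.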

duglin commented 7 years ago

right - we'd just need to be clear about what to do when that token "times out" - so the client knows when to start over.

bmelville commented 7 years ago

Is that true even of page numbers? Like when I ask for page 25 but it no longer exists.

duglin commented 7 years ago

It's up to the broker to decide if/when it times out. For a static catalog it may never do so, but a generated token might - depends on how long a broker is willing to remember the tokens. So we just need to define how the broker tells the platform that its "token" is old and needs to start over.

bmelville commented 7 years ago

Yes agreed. I was going to suggest the broker return something incredibly funny like 418 I'm a teapot, but I found some that might actually be nice instead, like 416 Requested Range Not Satisfiable or 410 Gone.

Perhaps just 400 Bad Request when this happens would suffice.
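On the platform side, "start over when the token is old" might look like the sketch below. The thread did not settle on a status code, so using 410 here is purely an assumption, as are the `fetch_catalog` name, the `fetch_page` callback shape (token in, `(status, body)` out) and the restart limit.

```python
from typing import Callable, Optional

# Assumption: 410 Gone signals an expired token. The thread also floats
# 416 and 400; no code was agreed upon.
TOKEN_EXPIRED = 410


def fetch_catalog(
    fetch_page: Callable[[Optional[str]], tuple[int, dict]],
    max_restarts: int = 3,
) -> list[dict]:
    """Collect every page, restarting from the beginning if the token expires."""
    for _attempt in range(max_restarts + 1):
        services: list[dict] = []
        token: Optional[str] = None
        while True:
            status, body = fetch_page(token)
            if status == TOKEN_EXPIRED:
                break  # discard partial results and start over
            if status != 200:
                raise RuntimeError(f"unexpected status {status}")
            services.extend(body["services"])
            token = body.get("next_token")
            if token is None:
                return services
    raise RuntimeError("pagination token expired on every attempt")
```

Bounding the number of restarts keeps a platform from looping forever against a broker whose tokens expire faster than the catalog can be read.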

duglin commented 7 years ago

It would be really cool if we could find a reason to use 418 - someplace! :-)

mattmcneeney commented 7 years ago

@bmelville out of interest, can I ask why you've gone for the approach of having one broker for hundreds/thousands of services, rather than multiple brokers?

bmelville commented 7 years ago

Having many small hand written brokers is okay, but not a great user experience for the person having to plug them all into one or perhaps many consumption environments.

Once you start building dynamic marketplaces around the broker concept though, it becomes advantageous to expose a single broker to both the consumer and the producer side. So you will typically have producers plugging in service definitions (rather than individual brokers) to the marketplace, and having the services available for consumption exposed as a single broker.

Adding @pmorie, as I believe we've talked about this in the past and he also has a similar use-case in his platform.

duglin commented 6 years ago

From 11/12 call, @jmrodri will either write-up a proposal or icebox this

jmrodri commented 6 years ago

I'm going to icebox this. We never hit the sizes I initially thought we would hit requiring pagination. So closing as I don't think it's necessary.
