Closed jmrodri closed 6 years ago
@pmorie here is the first cut at a proposal for supporting pagination in the catalog api.
Hello @jmrodri. Thanks for your proposal.
According to #122 proposals should be written in a Google Doc and linked from an issue. The reason is that docs make collaborating, writing notes and proposing changes to the spec easier. That is how previous proposals have been handled.
Also notice that spec changes should be validated through implementation once the design has been agreed upon (step 3).
@angarg12 I will write up the Google doc now and share the link in this issue. I apologize for not reading #122 first.
No apologies needed! Thanks for getting involved and contributing to the project :tada:.
@angarg12 @pmorie here is the google doc containing the proposal. let me know if I need to change anything like permissions etc.
https://docs.google.com/document/d/1P0MTtz5k0BZhi6CxXN9fsqmXBYojFKHxUrspndb6ODM/edit?usp=sharing
It works correctly, thanks.
What would be the next steps with this issue? Do we need a prototype?
You can see the latest contributing procedure here:
If your proposal is finished, then we should move to the validation through implementation.
On an additional note, not directly related to pagination: if there are potentially 10k+ entries in the catalog, one might need a way to have the broker provide search capabilities as well. I bring it up here because if there are 100s of pages, and only catalog entries on the current page are available to a client platform at a time, it would become difficult to search for specific entries. An example could be a way to search in tags for "caching" or "nosql" to see what options are on offer. That response could of course also include pagination.
@RamakrishnanArun this was brought up in our weekly meeting briefly. I suggest you open a new issue to discuss that topic specifically, so that everyone can participate.
Added an issue to discuss search on the catalog api in issue #136.
Hey folks, this is starting to come up for some of our use-cases, and I'm wondering if any movement has been made on it since February.
I'd also like to suggest a more generic approach to pagination where, instead of a field for page number, the pagination scheme uses opaque tokens. The service response returns a next_token, and the client passes this token when making further requests.
Brokers can still just use page numbers as the token if they want, or use more advanced tokens like item IDs to better capture where in the pagination the client was.
@bmelville " this is starting to come up for some of our use-cases" can you share more details of the pain it's causing and the use case?
We can then prioritise focusing on this issue.
@bmelville I'd be curious about the opaque token. Admittedly I don't see how that would work, not saying it won't just don't understand it 😄
@avade, we have brokers with potentially hundreds or thousands of service definitions exposed through them. We find that even >10 can be a lot of information to pass, especially with schemas part of the definition now.
@jmrodri, the problem with using just page numbers all the time is that the data behind the scenes can change between requests. So getting page 1 and then asking for page 2 could result in duplicate entries if something that was on page 1 before now moves to page 2 (something was inserted before it).
Using an opaque token could be as easy as the identifier for the next row in the database, such that the next request I make always starts with the next thing I am guaranteed not to have seen on the previous requests. This ensures I'm not getting duplicate entries.
If that doesn't concern you (e.g., your catalog is static), you could always just return the page number as the opaque token, but it allows a broker who needs to guard against it to do more advanced pagination tokens.
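To make the duplicate-entry problem above concrete, here is a minimal sketch comparing page numbers with an opaque token. It assumes a catalog kept in memory and ordered by service ID; `page_by_number`, `page_by_token` and the sample IDs are hypothetical names for illustration, not part of the spec or of any existing broker.

```python
import bisect
from typing import Dict, List, Optional, Tuple

Catalog = Dict[str, dict]  # hypothetical: service ID -> service definition


def page_by_number(catalog: Catalog, page: int, limit: int) -> List[str]:
    """Offset-based paging: page N starts at item (N - 1) * limit."""
    ids = sorted(catalog)
    start = (page - 1) * limit
    return ids[start:start + limit]


def page_by_token(
    catalog: Catalog, token: Optional[str], limit: int
) -> Tuple[List[str], Optional[str]]:
    """Token-based paging: the token is the ID of the next unseen item."""
    ids = sorted(catalog)
    # bisect_left also copes with the token's item having been deleted:
    # paging resumes at the next larger ID.
    start = 0 if token is None else bisect.bisect_left(ids, token)
    end = start + limit
    next_token = ids[end] if end < len(ids) else None
    return ids[start:end], next_token


catalog = {sid: {"name": sid} for sid in ["a", "c", "e", "g"]}

first_page, token = page_by_token(catalog, None, 2)

# A producer adds a service that sorts before items already returned.
catalog["b"] = {"name": "b"}

second_page, _ = page_by_token(catalog, token, 2)
second_page_by_number = page_by_number(catalog, 2, 2)

print(first_page, second_page)   # the token client sees no duplicates
print(second_page_by_number)     # the page-number client gets "c" again
```

Note that the token client never sees `b`, which was inserted behind its cursor. The token guards against duplicates, not against missed inserts.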
A specific use-case we have is a broker with a dynamic catalog, i.e., an administrator or producer can create and manage service definitions dynamically through a CRUD API.
right - we'd just need to be clear about what to do when that token "times out" - so the client knows when to start over.
Is that true even of page numbers? Like when I ask for page 25 but it no longer exists.
It's up to the broker to decide if/when it times out. For a static catalog it may never do so, but a generated token might; it depends on how long a broker is willing to remember the tokens. So we just need to define how the broker tells the platform that its "token" is old and needs to start over.
Yes agreed. I was going to suggest the broker return something incredibly funny like 418 I'm a teapot, but I found some that might actually be nice instead, like 416 Requested Range Not Satisfiable or 410 Gone.
Perhaps just 400 Bad Request when this happens would suffice.
It would be really cool if we could find a reason to use 418 - someplace! :-)
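As a rough sketch of the "token times out" case discussed above: a broker could remember each issued token for a limited time and answer with an error status once it no longer recognises one. Everything here is an assumption for illustration: the `TokenStore` class, the five-minute lifetime, the error body, and the choice of 410 Gone (the thread also floated 416 and 400, and nothing was agreed).

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 300.0  # hypothetical lifetime; a broker may pick any value


class StaleToken(Exception):
    """Raised when a token is unknown or has outlived its lifetime."""


class TokenStore:
    def __init__(self, ttl: float = TOKEN_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        self._tokens: Dict[str, Tuple[int, float]] = {}  # token -> (cursor, issued_at)

    def issue(self, cursor: int) -> str:
        """Hand out a random token that stands for a catalog position."""
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (cursor, self._clock())
        return token

    def resolve(self, token: str) -> int:
        """Return the cursor for a token, or raise StaleToken."""
        entry = self._tokens.get(token)
        if entry is None or self._clock() - entry[1] > self._ttl:
            self._tokens.pop(token, None)
            raise StaleToken(token)
        return entry[0]


def handle_catalog_request(store: TokenStore,
                           token: Optional[str]) -> Tuple[int, dict]:
    """Return (HTTP status, body) for a paginated catalog request."""
    try:
        cursor = 0 if token is None else store.resolve(token)
    except StaleToken:
        # Assumed status code: tells the platform to start over from page one.
        return 410, {"error": "StaleToken",
                     "description": "Pagination token expired; restart from the first page."}
    return 200, {"cursor": cursor}
```

A platform seeing the error status would drop its token and request the catalog again without one. A broker with a static catalog could skip the store entirely and never expire anything.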
@bmelville out of interest, can I ask why you've gone for the approach of having one broker for hundreds/thousands of services, rather than multiple brokers?
Having many small hand written brokers is okay, but not a great user experience for the person having to plug them all into one or perhaps many consumption environments.
Once you start building dynamic marketplaces around the broker concept though, it becomes advantageous to expose a single broker to both the consumer and the producer side. So you will typically have producers plugging in service definitions (rather than individual brokers) to the marketplace, and having the services available for consumption exposed as a single broker.
Adding @pmorie, as I believe we've talked about this in the past and he also has a similar use-case in his platform.
From 11/12 call, @jmrodri will either write-up a proposal or icebox this
I'm going to icebox this. We never hit the sizes I initially thought we would hit requiring pagination. So closing as I don't think it's necessary.
Proposal
The catalog API should support pagination in the cases where there are large numbers of services supported by a broker. We see a potential for 10k-15k catalog entries which could take a bit of time to render to the caller.
Effectively, the caller of the initial catalog request should expect either the Link header or pagination metadata in the body, indicating that the broker should be invoked with pagination.
The API should take the following optional parameters:
cURL
Body (changes in bold)
Response
The pagination block would be added to the catalog response at the same level as services. In the case of requesting the first page, the response will contain a next and a last item. Alternatively we could use the Link header RFC 5988. In this case there would be NO changes to the response body, just the headers.
Subsequent calls will yield 2 new options, first and prev. For example, if you request page 10000 the following response will be returned:
Alternatively we could use the Link header RFC 5988
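For illustration only, here is a sketch of how a broker might build such a Link header value with the four relations mentioned above (first, prev, next, last). The query parameter names `page` and `per_page`, the `link_header` helper and the example URL are assumptions; the proposal's actual parameter names are not shown here.

```python
from urllib.parse import urlencode


def link_header(base_url: str, page: int, per_page: int, total_items: int) -> str:
    """Build an RFC 5988 style Link header value for one catalog page."""
    last_page = max(1, -(-total_items // per_page))  # ceiling division

    relations = {}
    if page > 1:
        relations["first"] = 1
        relations["prev"] = page - 1
    if page < last_page:
        relations["next"] = page + 1
        relations["last"] = last_page

    links = []
    for rel, number in relations.items():
        # Hypothetical parameter names; swap in whatever the spec settles on.
        query = urlencode({"page": number, "per_page": per_page})
        links.append(f'<{base_url}?{query}>; rel="{rel}"')
    return ", ".join(links)


# First page: only "next" and "last" are present.
print(link_header("https://broker.example.com/v2/catalog", 1, 100, 1000))
# A later page also carries "first" and "prev".
print(link_header("https://broker.example.com/v2/catalog", 5, 100, 1000))
```

With this approach the response body stays exactly as it is today, and a client that ignores the header simply sees the first page.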