camaraproject / WorkingGroups

Archived repository used previously by CAMARA Working Groups.
https://wiki.camaraproject.org/display/CAM/CAMARA+Working+Groups

Pagination ... semantic clarification #142

Closed patrice-conil closed 1 year ago

patrice-conil commented 1 year ago

Hi, I have some issues with the way we handle pagination. The semantics of per_page, page and seek are not very clear, at least going by the example we provide. With per_page=20 and page=10, I would expect to retrieve elements starting from the 200th, but the second example says they start from the 10th. Isn't seek redundant if we consider that (per_page * page) is the first element we want to retrieve?

Maybe a simpler definition with two parameters, offset and limit, would be clearer:
offset = index of the first element to retrieve
limit = maximum number of elements to retrieve

There are also typos in the example (petition => pagination, and returnss => returns), see below.

================ Sample in the doc ===============
Petitions examples:
page=0 per_page=20, which returnss the first 20 resources
page=10 per_page=20, which returns 20 resources from the 10th element.

RubenBG7 commented 1 year ago

seek is intended to convey the identifier of the last item read, in a store with a large number of records. Example: page=10&per_page=100&seek=pr125, which returns 100 resources from the 10th page, where the last item read has the id "pr125".

patrice-conil commented 1 year ago

Thanks @RubenBG7. Sorry, but I'm still confused. Is the first item returned the 100x10 = 1000th item, or the item following pr125? If the first item returned is the 1000th, I don't understand why we need a seek parameter (we just have to "seek" 1000 items). If we start with the item following pr125, I don't understand the meaning of page=10. This is why I prefer offset and limit, which are frequently used in TMF APIs.

RubenBG7 commented 1 year ago

Imagine that you are working on a store with dynamic records: seek lets you know, between requests, which record id you last checked.

limit and offset queries work the same way as page and per_page.
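To make the "dynamic records" point concrete, here is a minimal sketch of how index-based paging can skip an item when the store changes between requests, while continuing from the last read id does not. It assumes the "last read id" interpretation of seek described in this comment; the helpers `by_offset` and `after_id` and the record ids are hypothetical, and the store is modelled as a plain sorted list.

```python
def by_offset(records, offset, limit):
    """Index-based paging: return `limit` records starting at `offset`."""
    return records[offset:offset + limit]


def after_id(records, last_id, limit):
    """Id-based paging: return up to `limit` records that follow `last_id`.

    Assumes records are sorted by id and that ids compare lexicographically.
    """
    return [r for r in records if r > last_id][:limit]


# Hypothetical store with ten records, pr001 .. pr010.
records = [f"pr{i:03d}" for i in range(1, 11)]

first = by_offset(records, 0, 3)          # ['pr001', 'pr002', 'pr003']

# A record that was already read is deleted between two requests.
records.remove("pr002")

# Index-based paging now skips pr004, because everything shifted left by one.
second_by_offset = by_offset(records, 3, 3)      # ['pr005', 'pr006', 'pr007']

# Continuing after the last id that was read keeps the sequence intact.
second_by_id = after_id(records, first[-1], 3)   # ['pr004', 'pr005', 'pr006']
```

The same effect appears in reverse when records are inserted: index-based paging then returns some items twice.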

patrice-conil commented 1 year ago

Thanks @RubenBG7. So if I understand correctly, what you want is to retrieve the records following the "seek id" provided in the request. In that case I imagine per_page represents the limit, i.e. the maximum number of records to retrieve from this id, but what is page=X used for? Or maybe only the seek parameter is given in the request?

In any case, I don't understand the sample in the guidelines that says: => page=10 per_page=20, which returns 20 resources from the 10th element.

patrice-conil commented 1 year ago

@RubenBG7, @shilpa-padgaonkar

I think what is confusing in the documentation is the parameter naming. If we take the sample as correct (page=10 per_page=20, which returns 20 resources from the 10th element), then page is an absolute offset, not related to the page size. It isn't a page number, so the name is confusing. The documentation is also unclear about mixing seek, page and per_page, which adds to the confusion.

Maybe we can choose, for each API, the most relevant implementation, with appropriate parameter naming:
GET /items?limit=20&after_id=xj23    20 items after id xj23, for Seek Pagination (we can also use before_id)
GET /items?limit=20&offset=0    items 0 to 19, for Offset Pagination
GET /items?page=20&per_page=10    items 200 to 209, for Page Pagination
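As a rough illustration of how the three styles select items, here is a small sketch over an in-memory list. The function names, the `id` field and the sample ids are hypothetical and only serve the example; page numbers are assumed to be zero-based, as in the proposal above.

```python
def offset_page(items, offset, limit):
    """Offset pagination: `limit` items starting at index `offset`."""
    return list(items[offset:offset + limit])


def numbered_page(items, page, per_page):
    """Page pagination: zero-based `page`, each page holding `per_page` items."""
    return offset_page(items, page * per_page, per_page)


def seek_page(items, after_id, limit):
    """Seek pagination: up to `limit` items following the item with id `after_id`."""
    ids = [item["id"] for item in items]
    start = ids.index(after_id) + 1  # raises ValueError if the id is unknown
    return list(items[start:start + limit])


# Hypothetical collection of 300 items with ids id0 .. id299.
items = [{"id": f"id{n}"} for n in range(300)]

offset_result = offset_page(items, offset=0, limit=20)     # items 0 to 19
page_result = numbered_page(items, page=20, per_page=10)   # items 200 to 209
seek_result = seek_page(items, after_id="id22", limit=20)  # items 23 to 42
```

In this sketch the page style is just the offset style with the multiplication done on the server side, whereas the seek style depends on an identifier rather than a position.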

What do you think?

aeftef commented 1 year ago

@RubenBG7, @patrice-conil It seems to me we are having a bit of a misunderstanding; let me explain briefly.

  1. First, I agree that the example could be clarified. The index to retrieve should be calculated as seek+(page*per_page); for page:10, per_page:20 that means 20 elements starting from the 200th position (offset=200, SQL speaking). When the example refers to "the 10th", it is talking about the 10th page. I agree it could be clearer and I will propose that it be changed.
  2. In this thread we have confused the meaning of the seek field: it is the first index from which to retrieve the first page (we can consider it to have the same meaning as offset). If we want to start the pagination at a specific record, not by index but by value, we could use the Content-Last-Key header for that (retrieving by value has its use cases, as it can improve performance in some cases).
  3. I agree that the same pagination result could be reached using just offset and limit instead of page, per_page and seek. But what we are looking for here is to provide flexibility and usability for the client. For instance, the natural way of working for some clients could be to retrieve a page and then just ask for the next pages, simply incrementing the page number. The alternative is that the client does the arithmetic to calculate the next index/offset (just last offset+limit). Personally I prefer the current approach as it leaves more freedom to the clients, but the other approach is also reasonable.
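The arithmetic in point 1 can be written down as a short sketch. The helper names `first_index` and `fetch` are hypothetical, seek is treated as a starting index (as described in point 2), and the store is modelled as a plain list.

```python
def first_index(page, per_page, seek=0):
    """Index of the first element to return: seek + (page * per_page)."""
    return seek + page * per_page


def fetch(items, page, per_page, seek=0):
    """Return the `per_page` elements of the requested page."""
    start = first_index(page, per_page, seek)
    return items[start:start + per_page]


items = list(range(1000))  # stand-in for a stored collection

print(first_index(page=10, per_page=20))          # 200
print(fetch(items, page=0, per_page=20)[0])       # 0
print(fetch(items, page=10, per_page=20)[0])      # 200
print(first_index(page=10, per_page=20, seek=5))  # 205
```

The equivalent offset/limit request for the same page would use offset = first_index(page, per_page, seek) and limit = per_page, which is the client-side arithmetic mentioned in point 3.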
patrice-conil commented 1 year ago

@aeftef, @RubenBG7 Thanks for clarifying the proposal... it was really too clever for me. :) I agree that sometimes we need "seek pagination" for very large dynamic datasets. In these cases, I've generally used index-to-index navigation (more in the manner of the Twitch or Twitter APIs) and not mixed index/page navigation, but that's pretty interesting. I don't know if this will be the case in our CAMARA use cases, but maybe you could give me some examples where we will have very large datasets (in MEC APIs?). Maybe start_after or after_index would be less confusing than seek, and it would open the door to reverse order (start_before/before_index).

Have a nice weekend.

RubenBG7 commented 1 year ago

Clarifications applied by @aeftef in PR #166