agola-io / agola

Agola: CI/CD Redefined
https://agola.io
Apache License 2.0
1.52k stars 117 forks source link

Add pagination to list APIs. #422

Open sgotti opened 1 year ago

sgotti commented 1 year ago

Current "list" APIs (with few exceptions) return all the results. This was meant to be only temporary since a sort of pagination is always needed to avoid returning too much data.

There are different options:

I'll prefer cursor based pagination (like already done for user/project get runs APIs).

NOTES

This will break current API but the APIs are not yet stable.

Additional required Actions

alessandro-sorint commented 1 year ago

We can use a Cursor with this data

StartID    string
PointsNext bool
OrderBy    string
SortOrder  SortOrder
Filters    map[string]string
Limit      int

StartIDcontain the starting data id(excluded) to fetch PointsNextif the user ask for next page is true Filterscan contains some filter, else it is empty

When the client will call the gateway API it will can provide some query values like limit, orderby, asc and some filters(if expected), for example: /projects?orderby=name&asc&limit=10 or /projects?limit=10 There are some APIs(example /runs?start&limit=10) that use start query value that, if we implement cursor pagination, we can remove start(because it can use cursor to ask next page): @sgotti do you think is better to maintain start or remove it?

There are some agola cmd list. We should also add here the cursor asking him in the args. Do you can confirm?

The APIs will return the data and pagination for example:

{
   Orgs           []*OrgResponse
   PaginationInfo *htypes.PaginationInfo 
}

where PaginationInfo contains:

{
   NextCursor string
   PrevCursor string
}

The client will use for the next calls only NextCursor for ask the next page or PrevCursor for ask the previous page, if the values ​​are present(for example the first page don't have PrevCursor). NextCursor and PrevCursor contain a base64 string of the Cursor described above.

The Gateway take the query values(limit, orderby,ecc.), or the cursor(that contains all the data needed), and use them to call the configstore/runservice. For example /projects?cursor=eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0%3D

The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors. The Gateway only copy the PaginationInfo to the reponse(and the data).

There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.

sgotti commented 1 year ago

PointsNextif the user ask for next page is true

I'm not sure I understand this.

we can remove start(because it can use cursor to ask next page)

We could also keep start if useful for some api to fetch from a specific id with specific option. Then for the next results the cursor could be used.

There are some agola cmd list. We should also add here the cursor asking him in the args. Can you confirm?

The APIs will return the data and pagination for example:

{
   Orgs           []*OrgResponse
   PaginationInfo *htypes.PaginationInfo 
}

where PaginationInfo contains:

{
   NextCursor string
   PrevCursor string
}

The output should contain a cursor. But there will only be one cursor. It doesn't makes sense to have a prev or next cursor. It'll just be a cursor that point to the next batch of data based on the original query. If the original query has a sort order the cursor will use that.

The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors.

It's the gateway that manages the exposed API and so it MUST also manage the cursor, since it could call multiple underlying services to create a response. The underlying services don't require a cursor since they aren't exposed.

There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.

I don't see why this won't work. Can you explain it?

Remember that if you're going to implement this, also agola-web should be changed and the related PRs should be created to be able to test them.

alessandro-sorint commented 1 year ago

I remove PointsNext since we maintain only one cursor for the next page, so the cursor contains this data:

StartID    string
OrderBy    string
SortOrder  SortOrder
Filters    map[string]string
Limit      int

The response of the gateway can be, for example:

{
   Orgs           []*OrgResponse
   Cursor string
}

or we can create a generic response like this:

{
   Data any
   Cursor string
}

What do you like is better?

We keep start for some api where is already implemented.

@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user. For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=

sgotti commented 1 year ago

Filters map[string]string

Don't be rigid on this. Filters, but also the other fields could be different based on the api.

What do you like is better?

The latter doesn't make sense.

@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user. For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=

By default the client could list all the entries by fetching all of them doing multiple calls to the api.

alessandro-sorint commented 1 year ago

Filters map[string]string

Don't be rigid on this. Filters, but also the other fields could be different based on the api. @sgotti I can define the Cursor as map[string]interface{}

sgotti commented 1 year ago

@sgotti I can define the Cursor as map[string]interface{}

There's no need to make cursor the same type for every api.

alessandro-sorint commented 1 year ago

@sgotti when the user call the last page do you think is better to return an empty cursor(so it know there are no other pages to ask) or the cursor with the last data id(the same as the other calls)?

sgotti commented 1 year ago

@sgotti when the user call the last page do you think is better to return an empty cursor

I'll return an empty cursor.

In the first case the gateway must call the configstore/runservice APIs with a limit+1 to check if there are other pages.

The underlying services could return if there's more data themself.

alessandro-sorint commented 1 year ago

@sgotti with the APIs list secrets/variables I think the pagination can not work when the client use the removeoverridden parameter, because this filter is implemented by the gateway. For example we have this Secrets:

{
   "name": "secret1","parent_path": "user/user01",
   "name": "secret1","parent_path": "user/user01/subgroup1",
   "name": "secret1","parent_path": "user/user01/subgroup2",
   "name": "secret2","parent_path": "user/user01",
}

If the client use parameter limit=2 the Gateway should return {"name": "secret1","parent_path": "user/user01"} and {"name": "secret2","parent_path": "user/user01"}, but calling the configstore whith limit=2, and next making the overriden filter it will return only {"name": "secret1","parent_path": "user/user01"} that is wrong(it should return also {"name": "secret2","parent_path": "user/user01"}). We should avoid this implementing the limit by the gateway(instead of by the configstore), but it can be much complex and with too many calls to the configstore.

I suggest to avoid/ignore limit when the client use removeoverridden parameter, or avoid limit for this APIs(secrets/variables).

sgotti commented 1 year ago

@alessandro-sorint I'm not sure I understand. If the api returns reproducible output as it should, then the cursor will contain the information on the next batch of data. How you get them is an implementation detail.

Limit int

I don't think Limit should be part of the cursor but should be set every time by the call and on the server side there'll be an hard limit.