conda-incubator / conda-store

Data science environments, for collaboration. ✨
https://conda.store
BSD 3-Clause "New" or "Revised" License

[ENH] - Ensure completeness when fetching all pages using REST API #859

Open krassowski opened 3 months ago

krassowski commented 3 months ago

Feature description

Currently the pagination API uses page-based pagination without sorting by time of creation/modification. This leads to incomplete results when iterating over the pages in dynamic systems where multiple modifications happen while the results are being fetched.

For example, if an additional environment is created while a client is iterating over the pages of environments (api/v1/environment), the results can end up with a duplicated entry or with the new environment missing. This is because the environments are sorted by namespace and name. For a more concrete example see the details below.

Say we have five environments A, B, D, E, F, and the page size is 2. The user performs two actions:
- requests the creation of an environment called "C", and then
- opens the form with a list of all environments.

If the database commit happens after the second page is fetched, the form could receive these replies:
- A, B - first page (ok)
- D, E - second page (ok at the time)
- E, F - third page - oops!

If many users are creating environments, the admin would be randomly missing some environments.

If the results were sorted by the date the environment was created (or last modified, if environments can be renamed), this would not be a problem, because the replies could be (assuming A, B, D, E, F were created in this order):
- A, B
- D, E
- F, C

Originally posted in https://github.com/nebari-dev/nebari/issues/2599#issuecomment-2260552463
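The scenario above can be reproduced with a small in-memory simulation. This is only a sketch: `Env`, `fetch_page`, and `iterate_with_insert` are hypothetical names made up for illustration, and an integer counter stands in for the creation timestamp. It does not use any conda-store code.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Env:
    name: str
    created: int  # increasing counter, a stand-in for a creation timestamp


def fetch_page(envs, page, size, key):
    """Return the names on one page of `envs` sorted by `key` (page-based pagination)."""
    ordered = sorted(envs, key=key)
    start = page * size
    return [e.name for e in ordered[start:start + size]]


def iterate_with_insert(key):
    """Fetch all pages of size 2; environment "C" is committed after the second page."""
    clock = count()
    envs = [Env(n, next(clock)) for n in "ABDEF"]
    seen = []
    page = 0
    while True:
        names = fetch_page(envs, page, 2, key)
        if not names:
            break
        seen.append(names)
        page += 1
        if page == 2:
            # Simulate the database commit landing between the 2nd and 3rd request.
            envs.append(Env("C", next(clock)))
    return seen


by_name = iterate_with_insert(lambda e: e.name)
by_created = iterate_with_insert(lambda e: e.created)

print(by_name)     # [['A', 'B'], ['D', 'E'], ['E', 'F']]
print(by_created)  # [['A', 'B'], ['D', 'E'], ['F', 'C']]
```

With the name ordering, "E" shows up twice and "C" is never returned. With the creation ordering, new rows are only ever appended after the pages already fetched, so the iteration stays complete as long as nothing is deleted in the meantime.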

Instead, one of the pagination implementations that guarantee data completeness should be used across the conda-store REST API.
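One widely used technique of this kind is keyset (cursor) pagination, where each request asks for the rows after the last key already seen and does not rely on a page offset. The sketch below shows the idea with the standard-library `sqlite3` module. It is an assumption that this is among the approaches meant here, and the `environment` table, its columns, and the helpers `fetch_after` and `fetch_all` are hypothetical, not conda-store's actual schema or API.

```python
import sqlite3


def fetch_after(conn, last_id, size):
    """Return up to `size` (id, name) rows whose id is greater than `last_id`."""
    return conn.execute(
        "SELECT id, name FROM environment WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, size),
    ).fetchall()


def fetch_all(conn, size, on_page=None):
    """Iterate over all rows with keyset pagination; `on_page` is a test hook."""
    rows, last_id = [], 0
    while True:
        page = fetch_after(conn, last_id, size)
        if not page:
            return rows
        rows.extend(page)
        last_id = page[-1][0]  # the cursor is the last id seen, not a page number
        if on_page is not None:
            on_page(len(rows))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE environment (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.executemany(
    "INSERT INTO environment (name) VALUES (?)", [(n,) for n in "ABDEF"]
)


def create_c(fetched):
    # Simulate another user creating "C" right after the second page was fetched.
    if fetched == 4:
        conn.execute("INSERT INTO environment (name) VALUES ('C')")


names = [name for _, name in fetch_all(conn, 2, on_page=create_c)]
print(names)  # ['A', 'B', 'D', 'E', 'F', 'C']
```

Because the cursor is a key and not an offset, rows inserted or deleted before the cursor do not shift the rows that come after it, so nothing is skipped or repeated.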

Value and/or benefit

No randomly missing items (e.g. environments) in paginated replies.

Anything else?

No response

trallard commented 2 months ago

@peytondmurray was this addressed by the environment fetch enhancement you made recently?

peytondmurray commented 2 months ago

No, this has to do with the pagination machinery, not the underlying fetch. Let's keep this open.

peytondmurray commented 2 months ago

Coming back to this, I think we can most easily fix this by simply sorting paginated results by time. I'm sure it's possible to do with the other proposed methods, but this just seems simplest to me. @trallard and @krassowski do you have preferences about this?

This seems like an easy fix, so once I have your input I'll happily fix this.

krassowski commented 2 months ago

> I think we can most easily fix this by simply sorting paginated results by time

Yes, that's one of the proposed methods :)

peytondmurray commented 2 months ago

Okay, with two votes for sorting by time, I'll do this. Stay tuned!