ipfs / pinning-services-api-spec

Standalone, vendor-agnostic Pinning Service API for IPFS ecosystem
https://ipfs.github.io/pinning-services-api-spec/
Creative Commons Zero v1.0 Universal

API v2 proposal: remove PinResults.count #86

Open lidel opened 2 years ago

lidel commented 2 years ago

Prompted by https://github.com/ipfs/ipfs-webui/issues/1900 – this is a BREAKING CHANGE for the v1 API, so I'm marking this as something we can discuss if we ever want to create a v2.

PinResults.count is used for two things:

(1) Reading pin count per status (used in ipfs pin remote service ls --stat):

GET /pins?limit=1&status=queued
GET /pins?limit=1&status=pinning
GET /pins?limit=1&status=failed
GET /pins?limit=1&status=pinned
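The stat flow above can be sketched as follows (a hypothetical client-side helper, not code from any real implementation): one cheap limit=1 request per status, reading only the count field of each response.

```python
# Illustrative sketch: derive per-status pin counts from PinResults.count
# using limit=1 queries, the way `ipfs pin remote service ls --stat` does.
# Helper names and the base URL are hypothetical.

STATUSES = ["queued", "pinning", "failed", "pinned"]

def stat_urls(base_url):
    """Build one limit=1 query per status; only `count` will be read."""
    return {s: f"{base_url}/pins?limit=1&status={s}" for s in STATUSES}

def stats_from_responses(responses):
    """`responses` maps status -> decoded PinResults body."""
    return {s: responses[s]["count"] for s in STATUSES}
```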

(2) Pagination (where it acts as an equivalent of more: true):

If the value in PinResults.count is bigger than the length of PinResults.results, the client can infer there are more results that can be queried.
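That inference fits in a few lines (a hypothetical client-side helper, shown only to make the v1 behavior concrete):

```python
def has_more(pin_results):
    """v1-style inference: if the reported total `count` exceeds the
    page of `results` just returned, more results can be queried."""
    return pin_results["count"] > len(pin_results["results"])
```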

Problem

Proposed change

This would be a breaking change that requires v2.x.x of this spec and coordinated effort across the stack: go-ipfs, js-ipfs, ipfs-webui, ipfs-desktop and Pinata.

Is it really breaking? Needs analysis.

I've been thinking a bit about a way to do this in a backward-compatible way: we could keep PinResults.count as an optional field and always return 1, so that pagination and health checks in old clients still work.

:point_right: Before we consider doing this, the main question is how an old client in go-ipfs will react to an unexpected PinResults.more – if it is ignored and nothing crashes, then we have a better path forward.

If not, we would break our users, and would need to coordinate go-ipfs/ipfs-webui/ipfs-desktop and Pinata to make the change around the same time to limit the damage. TBD if this is acceptable, considering the performance benefits in the long run.

cc @obo20 @aschmahmann

aschmahmann commented 2 years ago

If not, we would break our users, and would need to coordinate go-ipfs/ipfs-webui/ipfs-desktop and Pinata to make the change around the same time to limit the damage. TBD if this is acceptable, considering the performance benefits in the long run.

Can we not just add support for a v2 (and maybe a versions) endpoint and avoid breaking changes here? IIRC when we previously discussed not using libp2p (with multistream for protocol negotiation) and instead using HTTP, the claim was that version negotiation over HTTP would be similarly easy.

lidel commented 2 years ago

Yes, if we ever do this, then we would introduce this breaking change as v2 + a versioning scheme. I've updated the title of this issue and created topic/v2 for changes like this one.

So far this proposal has received no feedback from any pinning service – let's park it until there is interest and/or we gather more items under topic/v2

guseggert commented 2 years ago

Besides being expensive, looking at the count while paginating has problematic edge cases, e.g. pins being added/deleted while paginating.

If we're going to make a breaking change, we should consider using an opaque pagination token instead of the created timestamp, which makes fewer assumptions about the capabilities of the underlying DB and can include arbitrary info to help the DB paginate better.

obo20 commented 2 years ago

@lidel just now seeing this, and I definitely support this change. Count is super expensive for larger pinsets and can cause a lot of issues at scale.

I'm in favor of moving towards a created-timestamp paradigm here. We had to switch our normal pinList endpoint over to this paradigm, as the previous method, which required counting, was becoming extremely expensive for larger users.

obo20 commented 2 years ago

@lidel I would also add here that getting rid of count would allow us to raise our rate limits, which is something I know the PL team / ipfs-desktop users have been asking for on this specific endpoint.

SgtPooki commented 2 years ago

I think we should remove count, and use an opaque token like "nextToken". Gus said it best:

If we're going to make a breaking change, we should consider using an opaque pagination token instead of the created timestamp, which [the pagination token] makes fewer assumptions about the capabilities of the underlying DB and can include arbitrary info to help the DB paginate better.

obo20 commented 2 years ago

We could implement things pretty easily right away if we went with time-based pagination, as we already switched the rest of our application over to it, but I'm guessing opaque-token pagination might be a much bigger lift.

@SgtPooki @guseggert could you both elaborate on what you have in mind here? Any example articles you could point to?

SgtPooki commented 2 years ago

@obo20 absolutely. First here's a quick summary in my own words:

I think of pagination tokens like nextToken as a sort of ephemeral API key for a search/query. Basically, it's a pointer (see: cursor in cursor-based pagination) to a certain record. You could map this back to a time-based token if that's how your service is implemented. When you receive a request with a nextToken, you know exactly where to start searching, instead of searching/filtering through all records on every query.

Those records can be sorted/filtered however you wish, and you can encode any sorting/filtering/pages/users/etc. within that token. This flexibility is great, but it does come with drawbacks that I've personally experienced (and complained about plenty) during my time at Amazon. For service owners, though, it gives you a lot more control over how your resources are consumed, no matter how many consumers or objects you have.
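To make the "pointer to a certain record" idea concrete, here is a rough sketch (all names hypothetical; this assumes records are sorted by a unique created key, and a real service would likely also sign or encrypt the token so clients can't forge it):

```python
import base64
import json

def make_cursor(last_item):
    """Encode the sort key of the last item returned as an opaque blob."""
    payload = json.dumps({"after": last_item["created"]})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def page(records, limit, cursor=None):
    """Return one page plus a cursor for the next page, over `records`
    sorted by their unique `created` key. A cursor resumes strictly
    after the record it points at."""
    after = None
    if cursor:
        after = json.loads(base64.urlsafe_b64decode(cursor))["after"]
    remaining = [r for r in records if after is None or r["created"] > after]
    out = remaining[:limit]
    next_cursor = make_cursor(out[-1]) if len(remaining) > limit else None
    return out, next_cursor
```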

I believe @guseggert implemented more than a few APIs at Amazon where we were required to use this pagination strategy when DynamoDB was our backend (which it was most of the time), so he can correct me or speak further to the details if you need more info from someone with more experience.

Below are some great articles I've found that should help the discussion here for anyone interested. The writeup by Slack is a fantastic read that goes into the pros and cons of many approaches.


Apollo GraphQL

Not as good as the others, but informative in a basic sense. Very specific to Apollo GraphQL.

Since numeric offsets within paginated lists can be unreliable, a common improvement is to identify the beginning of a page using some unique identifier that belongs to each element of the list.

If the list represents a set of elements without duplicates, this identifier could simply be the unique ID of each object, allowing additional pages to be requested using the ID of the last object in the list, together with some limit argument. With this approach, the requested cursor ID should not appear in the new page, since it identifies the item just before the beginning of the page.

--- https://www.apollographql.com/docs/react/pagination/cursor-based/

hackernoon article

Great writeup covering unidirectional and bidirectional pagination from the view of an engineer.

For unidirectional pagination, my preferred approach is to use a simple cursor. The important detail here is to make the cursor meaningless.

As far as the client is concerned, it’s just a blob the server returns in the response when there are more results to fetch. The client shouldn’t be able to derive any implementation details from it, and the only thing it can afford to do with this cursor is to send it along in the next request.

--- https://hackernoon.com/guys-were-doing-pagination-wrong-f6c18a91b232

Slack

An extremely thorough writeup of Slack's considerations when upgrading their pagination API.

Cursor-based pagination works by returning a pointer to a specific item in the dataset. On subsequent requests, the server returns results after the given pointer. This method addresses the drawbacks of using offset pagination, but does so by making certain trade offs:

  • The cursor must be based on a unique, sequential column (or columns) in the source table.
  • There is no concept of the total number of pages or results in the set.
  • The client can’t jump to a specific page.

--- https://slack.engineering/evolving-api-pagination-at-slack/

guseggert commented 1 year ago

At a high level, you can encode whatever state you need to fetch the next page in the pagination token (you might want to encrypt it too, so clients can't reverse it). E.g. for a relational database, you can encode a JSON object like {"limit":20,"offset":100}; for key-value stores that paginate with their own tokens, use that token. Some folks need to add extra data like a timestamp, so that the token will eventually expire.
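As a minimal sketch of this, the example token used in this thread is just base64-encoded JSON; a service could mint and decode such tokens like so (in practice you would also sign or encrypt the payload, which is omitted here):

```python
import base64
import json

def encode_token(state):
    """Serialize pagination state into an opaque base64 token."""
    return base64.b64encode(json.dumps(state).encode()).decode()

def decode_token(token):
    """Recover the pagination state from a token the client sent back."""
    return json.loads(base64.b64decode(token))
```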

Concretely, GET /pins would return something like:

{
    "nextToken": "eyJsaW1pdCI6MjAsIm9mZnNldCI6IDEwMH0K",
    "results": [...] 
}

and then the subsequent request would be like GET /pins?nextToken=eyJsaW1pdCI6MjAsIm9mZnNldCI6IDEwMH0K; repeat until you decide to stop or nextToken is null.
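A client loop for this scheme would look roughly like the following (hypothetical helper names; fetch_page stands in for the actual HTTP call):

```python
def list_all_pins(fetch_page):
    """Drain all pages: keep passing back whatever `nextToken` the
    server returned until it comes back null/absent."""
    pins, token = [], None
    while True:
        resp = fetch_page(token)  # e.g. GET /pins?nextToken=<token>
        pins.extend(resp["results"])
        token = resp.get("nextToken")
        if token is None:
            break
    return pins
```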

The downside from a client's perspective is that you can't parallelize queries. That's an upside from a server's perspective, and generally a desirable thing for a multitenant API. But I'm not sure how this would impact existing impls.

Essentially though, I think we should decouple the query params / filters from the pagination mechanism, so that we don't need to make assumptions about the query capabilities of the DB in order to paginate, and the most general way that I know of to do that is with an opaque token.

obo20 commented 1 year ago

@SgtPooki @lidel did we want to make official the decisions from ipfs-thing?

lidel commented 1 year ago

IIRC we were discussing backward-compatible ways of removing the cost of returning an exact count value, to avoid breaking existing clients.

Removing or changing the type of count will break existing clients which use it for pagination (docs). However, if we keep it but relax the requirement around its value to something like "equal to results, or results+1 (or MAX_INT?) to indicate there are more records", then you no longer need to pay the cost of calculating the exact size.
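A server-side sketch of this relaxation (illustrative only, assuming the backend can cheaply tell whether another page exists): returning len(results)+1 when more records remain preserves the old "count > len(results) implies more" inference without computing an exact total.

```python
def build_pin_results(page, has_more):
    """Approximate-count PinResults: `count` equals the page size when
    this is the last page, and page size + 1 when more records exist."""
    count = len(page) + 1 if has_more else len(page)
    return {"count": count, "results": page}
```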

Next steps here (I have no bandwidth for this, but happy to review PRs):

  1. propose a PR against this repo that updates the Pagination-and-filtering section, making it clear pinning services can return an approximate count (like results+1) when the cost of calculating the real one is too high.
  2. test if https://github.com/ipfs/go-pinning-service-http-client (used by Kubo) will correctly paginate if count value is no longer exact.