TTTaevas / osu-api-v2-js

Package to easily access the new api of osu!
https://osu-v2.taevas.xyz
The Unlicense
6 stars 0 forks source link

Requests cannot be cancelled #33

Closed TTTaevas closed 6 months ago

TTTaevas commented 6 months ago

As discussed previously in #32, it would be a cool feature if users of this package could cancel any request at any point And as far as I am aware, given that fetch() is used to make requests, the only way to do that would be to use an AbortController's AbortSignal So the question now is, how can users access a request's AbortController or AbortSignal?


Proposal 1, allow users to give an AbortController as an argument in essentially every method

That would mean changing every currently existing methods to allow this additional argument, which would make things ugly really fast, I don't think I would want to do that

// getUserBeatmaps(), before
export async function getBeatmaps(this: API, user: User["id"] | User, type: "favourite" | "graveyard" | "guest" | "loved" | "nominated" | "pending" | "ranked", config?: Config): Promise<Beatmapset.Extended.WithBeatmap[]> {
    return await this.request("get", `users/${getId(user)}/beatmapsets/${type}`, {limit: config?.limit, offset: config?.offset})
}

// getUserBeatmaps(), after, assuming `API.request()` changes to support a controller
export async function getBeatmaps(this: API, user: User["id"] | User, type: "favourite" | "graveyard" | "guest" | "loved" | "nominated" | "pending" | "ranked", config?: Config, controller?: AbortController): Promise<Beatmapset.Extended.WithBeatmap[]> {
    return await this.request("get", `users/${getId(user)}/beatmapsets/${type}`, {limit: config?.limit, offset: config?.offset}, controller)
}

Proposal 2, have the api object list its ongoing requests along with AbortControllers

Roles would be switched, as the api would be providing the AbortControllers to the user, but it would not be possible to identify a specific request's controller with 100% certainty; The user would be able to cancel any request fitting a rather broad condition but not a very specific request In other words, it'd be unreliable

The API would likely have a property (array) called current_requests and API.request() would take the sole responsibility of adding and removing objects that have things like the route, the parameters, the AbortSignal...


Proposal 3, make a copy of the api object and give it special properties, such as a specific AbortController

Users would be the ones providing the AbortControllers, and would do so by calling another method in which they can give their AbortController; that function will create a copy of the api object and give said AbortController to it, by changing its (new) controller property

// in the API class
public with(args: {controller: AbortController}): API {
    let new_api = new API({...this})
    new_api.controller = args.controller
    return new_api
}
// using the same controller for multiple requests
const api = await osu.API.createAsync({id, secret})
const controller = new AbortController()
const special_api = api.with({controller}) // `api` doesn't have the controller, but `special_api` does
const user = await special_api.getUser(7276846)
const beatmap = await special_api.getBeatmapset(1971037)

// using a specific controller for a single request
const api = await osu.API.createAsync({id: Number(id), secret})
const controller = new AbortController()
const user = await api.with({controller}).getUser(7276846)

Maybe with could support other arguments, like custom headers, turn on retry on timeout, other fetch options... Either way, it means creating a whole new api object everytime someone wants to use a new controller, which does not sound very optimal to be honest

It's without mentioning how it would interact with refresh_on_expires which by default does a setTimeout on the older api but would not actually do it on the newer special_api

It may be a good option if more thoughts are put into it


All of the above either gives an AbortController to the user or lets the user give an AbortController, but if there is a way around using those, then that way would probably be worth considering

baileyherbert commented 6 months ago

Hi @TTTaevas,

Thanks for opening this, and apologies that I took a few days to show back up.

First, I want to make an important correction. In all cases, you will actually want to accept an AbortSignal object, rather than an AbortController. This is consistent with other packages and libraries, including fetch itself, as the user won't always have a controller on hand. I would therefore expect the code in proposal 3 to look like this:

const controller = new AbortController()
const special_api = api.with({ signal: controller.signal });

Now onto the three proposals:

Proposal 1 is how a typical API library would tackle this problem in languages that emphasize cancellation, like C#. However, the number of optional parameters in some of the methods makes this unappealing. IMO, this would only make sense if each method accepted a single object parameter for all of its inputs, and a second parameter for request-specific options, headers, etc.

Proposal 2 I would not recommend. The thought of fishing to find the correct cancellation token is quite unappealing, the majority of use cases will probably want to see a single AbortSignal for all requests, and I've never seen another API library implement cancellation that way.

Proposal 3 is exactly the approach that I was going to suggest, and is the way that I myself often solve these kinds of problems in the NodeJS ecosystem. A method to "clone" an API instance, and to customize the options on that clone, while still inheriting the same credentials and sharing token renewal with the original instance, would be the cleanest approach.

A possible implementation for the third proposal could be to add an overrides object parameter to the internal request() method. Then, create a new class like class ChildAPI extends API which takes the original instance, disables authentication, and overrides the request() method to call original.request(..., this.overrides), where this.overrides is the object passed into api.with(). This way, the same underlying credentials and token renewal processes are shared, as the original instance is still executing the requests.

The ChildAPI class would then have some unused properties, but I think this is a worthy sacrifice for keeping it compatible with the base API type, which could be important for TypeScript users in some circumstances. The properties could be changed to getters that directly reference the original instance's values, the with() method could call original.with(), and so on.

Because fetch() only accepts a single signal, this will conflict with your timeout implementation. However, the timeout should still work, unless the user sets it to 0. The two signals should instead be merged together separately for each request. This is typically achieved with AbortSignal.any([s1, s2]), but unfortunately that requires Node.js v20.

Here's a more widely compatible helper function to merge signals, sourced from whatwg/fetch#905:

/**
 * Returns an abort signal that is getting aborted when
 * at least one of the specified abort signals is aborted.
 *
 * Requires at least node.js 18.
 */
export function anySignal(
  ...args: (AbortSignal[] | [AbortSignal[]])

): AbortSignal {

  // Allowing signals to be passed either as array
  // of signals or as multiple arguments.
  const signals = <AbortSignal[]> (
    (args.length === 1 && Array.isArray(args[0]))
      ? args[0]
      : args
  );

  const controller = new AbortController();

  for (const signal of signals) {

    if (signal.aborted) {
      // Exiting early if one of the signals
      // is already aborted.
      controller.abort(signal.reason);
      break;
    }

    // Listening for signals and removing the listeners
    // when at least one symbol is aborted.
    signal.addEventListener('abort',
      () => controller.abort(signal.reason),
      { signal: controller.signal }
    );

  }

  return controller.signal;

}

Since most use cases will generally center around a single user-provided signal, I would not consider this 'duplication' of API instances to be particularly troublesome, so long as they share credentials and token renewal with the original instance, including the background renewal timeout.

The only real concern, and it is a very minor one, is the way the various "endpoint collections" are attached to the API client as properties, e.g. starting here:

https://github.com/TTTaevas/osu-api-v2-js/blob/master/lib/index.ts#L576

TypeScript moves those property assignments into the constructor, which means those are not an inherent part of the API prototype, and this will create some slight extra work for the CPU each time an API instance is cloned.

I highly doubt this has any meaningful performance impact, but it is a potential target for optimization if you are concerned about these duplicate API instances being "heavy." A more performant alternative would be to assign those functions on the prototype itself, e.g.

API.prototype.lookupBeatmap = Beatmap.lookup;

TypeScript will resist this approach a bit, but it is very doable.

Let me know your thoughts?

TTTaevas commented 6 months ago

Hello! There's no need to apologize, your help is very appreciated Forgive me if any of the following is wrong, I must admit those are concepts I'm pretty much unfamiliar with and I may need some more sleep

In all cases, you will actually want to accept an AbortSignal object, rather than an AbortController. [...] as the user won't always have a controller on hand.

I can't help but feel like it's backwards, why would the user give a signal if they don't have a controller to abort it with? Furthermore, if the API stored the received controller, then the user would not need to have it on hand anymore

Because fetch() only accepts a single signal, this will conflict with your timeout implementation. However, the timeout should still work, unless the user sets it to 0. The two signals should instead be merged together separately for each request.

Didn't think of that, good call there! I don't think there'd be any timeout implementation to circumvent the usage of multiple signals, because if two requests are made with the same controller, and one of the requests timeout, then that would cancel the other request, so yep multiple signals are needed

However, I'd like to point out that the code you've shared (somehow?) requires Node.js to be at least version 18! The other bit of code in the comment you've linked might fit better, which goes like:

function anySignal(signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();

  for (const signal of signals) {
    if (signal.aborted) {
      controller.abort();
      return signal;
    }

    signal.addEventListener("abort", () => controller.abort(signal.reason), {
      signal: controller.signal,
    });
 }

  return controller.signal;
}

Otherwise, it would mean dropping support for older versions of Node.js (that'd mean in turn dropping node-fetch and probably get to using AbortSignal.timeout())

I highly doubt this has any meaningful performance impact, but it is a potential target for optimization if you are concerned about these duplicate API instances being "heavy." A more performant alternative would be to assign those functions on the prototype itself

If I'm not misunderstanding, wouldn't the only way to achieve that be to essentially assign those functions twice, with that bit of code you've shared being right outside the class block? For the time being, I think I'm not that desperate for performance, but it's still a good thing to know!

Putting aside all of the above, the implementation with the ChildAPI sounds great to me! :D

baileyherbert commented 6 months ago

I can't help but feel like it's backwards, why would the user give a signal if they don't have a controller to abort it with?

It is possible to obtain a signal without having a controller:

As an HTTP client, the only thing you need to do is listen for cancellations, and that's exactly what the AbortSignal is for. Therefore, the convention is to only ask for the signal. This is true in Node.js, and I know it is also true in C# where only CancellationToken objects are passed around, which are directly equivalent to AbortSignal.

Furthermore, if the API stored the received controller, then the user would not need to have it on hand anymore

I appreciate what you're trying to do here (from a "developer experience" or utility perspective) but I don't think it's necessary and they won't always have a controller. Here's a counter-argument to the utility side of it:

I personally use AbortController and AbortSignal frequently. However, all of the packages I've used before, in addition to native JS features like fetch, have only asked for the signal, so that's all I pass around in my application. If I suddenly had a package asking for the entire controller, I'd need to do some serious revamping to pass around the controller too, which is not great considering the package won't even use the controller in this case.

However, I'd like to point out that the code you've shared (somehow?) requires Node.js to be at least version 18! The other bit of code in the comment you've linked might fit better, which goes like:

A good callout, though the other code snippet you selected also requires Node 18. 😓 From what I can tell, the version constraint stems from the signal option being passed to signal.addEventListener(), which tells it to automatically stop listening for events upon cancellation. That's a v18 feature, so we'd just need to alter the code to manually unsubscribe. This looks alright:

function anySignal(signals: AbortSignal[]) {
    const controller = new AbortController();
    const unsubscribe: (() => void)[] = [];

    function onAbort(signal: AbortSignal) {
        controller.abort(signal.reason);
        unsubscribe.forEach((f) => f());
    }

    for (const signal of signals) {
        if (signal.aborted) {
            onAbort(signal);
            break;
        }

        const handler = onAbort.bind(undefined, signal);
        signal.addEventListener('abort', handler);
        unsubscribe.push(() => signal.removeEventListener('abort', handler));
    }

    return controller.signal;
}

If I'm not misunderstanding, wouldn't the only way to achieve that be to essentially assign those functions twice, with that bit of code you've shared being right outside the class block? For the time being, I think I'm not that desperate for performance, but it's still a good thing to know!

You would still need to register the types separately on the class, but the actual assignment only needs to happen once after the class is defined. Here's a more complete example, that uses Object.assign() to avoid complaints from TypeScript due to the readonly modifier:

class API {
    readonly lookupBeatmap!: typeof Beatmap.lookup;
}

Object.assign(API.prototype, {
    lookupBeatmap: Beatmap.lookup,
    // more can go here
});

It's certainly not very elegant. If you don't mind it, then it's a potential opportunity for optimization, as it will reduce the complexity by moving all of those functions into the prototype, which is shared by all instances of API. I highly doubt the gains will be notable in any way, though.

TTTaevas commented 6 months ago

Thanks for your explanations! I've really gotten a better understanding of all those things now :D (or at least I feel like that's the case)

I've tried doing the way you've described with the ChildAPI, but it seems like all methods that use request will use the one defined in API instead of the one in ChildAPI; The request in ChildAPI is never called according to my testing!

I've created a branch called with-childapi which roughly implements it if you feel like checking for yourself I'll try to come up with something different (yet similar) in the coming days, I really want to solve this issue!

baileyherbert commented 6 months ago

Hmm, I didn't have any problems with your code. I can't really see how it wouldn't work, either. The only way that the endpoint methods can reference request is through this, which automatically points to the child object through runtime binding.

const api = new API({});
const child = api.with({ signal: new AbortController().signal });

async function test() {
  console.log(await child.getBeatmap(123));
}

test().catch(console.error);

Result:

You would see this if ChildAPI.request() was called
APIError {
  message: 'Unauthorized',
  server: 'https://osu.ppy.sh/api/v2',
  endpoint: 'beatmaps/123',
  parameters: {},
  status_code: 401,
  original_error: undefined
}

If you're still having issues, could you share the code that isn't working?

TTTaevas commented 6 months ago

Oh I'm dumb! I've made a really silly mistake!! Okay now my testing works, sorry for the false alarm! I'll keep you updated!