calculateDelay causes retry opts to be ignored and an infinite retry loop

daverickdunn commented 2 years ago

Describe the bug

Node.js version: 14.?.?, 16.15.0
OS & version: Amazon Linux 2 (Lambda), Win 11

I want to retry 403's n times, but not other status codes. This works fine when I omit calculateDelay, but will cause an infinite(?) retry loop for all status codes when I include it.

I'm testing against a 404 below, but I know it also happens for 400.

Actual behavior

Retries all errors indefinitely.

Expected behavior

Does not retry unspecified errors.

Code to reproduce


import got from 'got';

const client = got.extend({
    retry: {
        limit: 0
    }
});

async function mytest() {
    const result = await client.get('https://jsonplaceholder.typicode.com/make-a-404', {
        resolveBodyOnly: true,
        responseType: 'json',
        retry: {
            limit: 3,
            statusCodes: [
                403
            ],
            errorCodes: [],
            calculateDelay: () => 1000, // causes infinite(?) retries
        },
        hooks: {
            beforeRetry: [
                (error, retryCount) => {
                    console.log(`Retrying [${retryCount}]: ${error.code}`);
                    // Retrying [1]: ERR_NON_2XX_3XX_RESPONSE
                }
            ]
        }
    });
    console.log(result)
}

mytest()

Output

  Retrying [1]: ERR_NON_2XX_3XX_RESPONSE
  Retrying [2]: ERR_NON_2XX_3XX_RESPONSE
  // ...
  Retrying [36]: ERR_NON_2XX_3XX_RESPONSE
  Retrying [37]: ERR_NON_2XX_3XX_RESPONSE

Checklist

[✅] I have read the documentation.
[✅] I have tried my code with the latest version of Node.js and Got.

szmarczak commented 2 years ago

https://github.com/sindresorhus/got/blob/main/documentation/7-retry.md#calculatedelay

I want to retry 403's n times, but not other status codes.

if (error?.response.statusCode === 403 && attemptCount < N) {
    return 1000;
}
return calculatedDelay;

daverickdunn commented 2 years ago

I'm sorry but your reply is even less useful than the docs. Where is that code supposed to be placed?

Are you trying to tell me that the function called calculateDelay doesn't just calculate a delay, but instead overrides the entire retry config until you return 0? If so, is that code supposed to be inside calculateDelay, then how are you getting a handle on the error object?

szmarczak commented 2 years ago

Where is that code supposed to be placed?

The calculateDelay function.

how are you getting a handle on the error object?

It's passed as a property of the first argument which is a retry detail object. You can destructure like this:

({error, attemptCount, calculatedDelay}) => { ... }

the function called calculateDelay doesn't just calculate a delay, but instead overrides the entire retry config until you return 0

It can override the retry config if you code it to do so. It calculates the delay based on the config. As per docs, 0 cancels the retry.

daverickdunn commented 2 years ago

It can override the retry config if you code it to do so. It calculates the delay based on the config. As per docs, 0 cancels the retry.

But it always does override the retry config, that's why I opened this issue. You have to explicitly code it to recreate the conditions that are determined by limit, statusCodes, errorCodes etc. They're all overridden by a function called calculateDelay, that's a very confusing side effect and not a great user experience.

The calculateDelay function should either:

A: calculate a delay, and nothing else.
B: be renamed to something like overrideRetry.

Assuming you do want a function that overrides everything in the retry config, it's a little strange that this function resides within the retry options. I think it would be much clearer if the retry property itself were overloaded to accept RetryOptions | RetryFunction.

Hopefully you can see, at a minimum, the naming of that function is confusing.

szmarczak commented 2 years ago

But it always does override the retry config,

Again, see my comment:

It can override the retry config if you code it to do so.

You have to explicitly code it to recreate the conditions

Why do you want to reinvent the wheel? What's the reason against using computedValue instead?

The calculateDelay function should either

All it does it calculates the delay. If 0, it cancels the retry. It CAN be used to override the retry config, but does NOT have to be. It depends on how you code this function.

it's a little strange that this function resides within the retry options

It does make sense. Retry config, calculate retry delay function. Retry, retry - all in one place. The function accepts the config as input as well.

If you do not like how this was designed, feel free to use something else.

szmarczak commented 2 years ago

Also see https://github.com/sindresorhus/got/issues/1277#issuecomment-633123144 - maybe this will help you

daverickdunn commented 2 years ago

If you do not like how this was designed, feel free to use something else.

If you can't handle constructive criticism without getting passive-aggressive then maybe open-source just ain't for you. Seriously.

Why do you want to reinvent the wheel? What's the reason against using computedValue instead?

Because it's an unexpected side-effect of a function called computeDelay. You wouldn't write a function called multiply that actually deletes a file, would you? When I use that function it does so much more that calculate a delay and makes all the other retry options redundant, that's just not well designed, sorry.

szmarczak commented 2 years ago

If you can't handle constructive criticism without getting passive-aggressive then maybe open-source just ain't for you. Seriously.

You are always welcome to contribute or become a competition. You're trying to force retry behavior change, which is very unlikely to happen. There are other https://github.com/sindresorhus/got#comparison HTTP clients, some are simpler to use (but less featured). You just don't have to go with Got. You can though. That's what I said, that is not passive aggressive. There is a reason the calculateDelay was designed like this. This is how the algorithm works:

An error is caught: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L307
Then we check whether a retry handler is attached (it's attached automatically when using promises, but not when using streams): https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L341
The retry-after header is read: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L346-L357
Default computed value is calculated: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L366-L372
User provided calculateDelay is called: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L361-L373
Then we check if the returned value is truthy: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L379
Retry is done: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L380-L419
If 0 was returned, throw: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/index.ts#L423

The calculateDelay function is called every time an error is caught. If you always return 1000, then it will always wait 1000 ms and make a new retry. That's just how you coded it. The default implementation looks like this (it's also in the docs):

https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/options.ts#L759

So Got does its logic and passes the value to the user provided function. If you want to cancel a retry, simply return 0.

If computedValue is non-zero, then the default behavior is to make a retry. If it is a zero, then the default behavior is not to make a retry. Very simple. Knowing that you could just write:

if (computedValue) {
    // override the default retry delay
    return 1000;
}

// go with the default behavior = do not retry
return 0;

That doesn't override the default logic, it just overrides the delay in case a retry happens. So my statement that calculateDelay enables override, but does not enforce it is true.

I wouldn't say it's overcomplicated. Please take a look at the default logic: https://github.com/sindresorhus/got/blob/f21632d08a094fa86db21fee5b9d56d4fd5436cd/source/core/calculate-retry-delay.ts#L5-L44 you'll see that it is a bit sophisticated. The reason caluclateDelay was designed like this is to enable partial overrides. Someone may want to retry immediately (return 1) on DNS errors and proceed with the default on other errors. Having two functions, one for enabling retries and the other for calculating the delay in case a retry happens would just duplicate code. Example:

shouldRetry({error, computedValue}) {
    if (error.code === 'EFOOBAR') {
        return true;
    }

    if (error.code === 'ENEVERRETRY') {
        return false;
    }

    return computedValue;
},
calculateRetryDelay({error, computedValue}) {
    if (error.code === 'EFOOBAR') {
        return 1000;
    }

    return computedValue;
},

See the duplicated ifs? To prevent duplicated code, you can just write:

calculateDelay({error, computedValue}) {
    if (error.code === 'EFOOBAR') {
        return 1000;
    }

    if (error.code === 'ENEVERRETRY') {
        return 0;
    }

    return computedValue;
}

szmarczak commented 2 years ago

We even have a note that states:

This function is responsible for the entire retry mechanism, including the limit property. To support this, you need to check if computedValue is different than 0.

So it was all in the docs. I don't think we need to go further with the conversation, so I'll just lock the issue.

sindresorhus / got