sindresorhus / got

🌐 Human-friendly and powerful HTTP request library for Node.js
MIT License
14.27k stars 935 forks source link

got() does not return when a response is in cache, but has been revalidated #2170

Closed Sigill closed 1 year ago

Sigill commented 1 year ago

Describe the bug

Actual behavior

I am trying to use Github's API, but enabling Got's cache causes some issues.

// You will need to export GITHUB_TOKEN=12345678... to avoid Github's limits.
import got from 'got';

const cache = new Map();

async function get() {
  const response = await got('https://api.github.com/repos/gcc-mirror/gcc/tags', {
    cache,
    cacheOptions: { shared: false },
    responseType: 'json',
    headers: process.env.GITHUB_TOKEN ? {'authorization': `token ${process.env.GITHUB_TOKEN}`} : {},
  });

  console.log(`response.complete:`, response.complete);
  console.log(`Got ${response.body.length} tags`);
}

(async () => {
  await get();

  console.log("Waiting for max-age to expire");
  await new Promise(resolve => setTimeout(resolve, 65000));
  console.log("Starting second request");

  await get();
})();

Output:

response.complete: true Got 30 tags Waiting for max-age to expire Starting second request

No issue with no cache.

While investigating, I found that Github's responses have the following header: cache-control: private, max-age=60, s-maxage=60.\ If the second request is done within 60s of the first, the result is better (we get a usable response), however some properties of the returned object (eg: .complete) are still missing.\ For example, reducing the delay between the two requests to 1s generates the following output:

response.complete: true Got 30 tags Waiting for max-age to expire Starting second request response.complete: undefined Got 30 tags

Expected behavior

got() should be able to work with responses retrieved from the cache, whether they have been revalidated or not.

Fixing .complete when using a cache has already been attempted in the past: https://github.com/sindresorhus/got/commit/9e15d887da3b065940bbc8ca38f9c748a0bbc75e.

Code to reproduce

See above.

Checklist

zkx5xkt commented 1 year ago

After some investigation I have found that the problem occurs when cache revalidation with turned on compression results in a 304 status code. (I have added a test for the case in this pr)

It seems like the problem stems from a questionable workaround in mimic-response (manually emitting lifetime events of Stream to which node.js official documentation advises against) that is used by got dependencies: decompress-response, cacheable-request.

I think that the aforementioned hack combined with the partial implementation of the http.IncomingMessage by responselike (another package used by cacheable-request and got) results in the behavior we're observing.

Removing that workaround fixed the problem for me. However, I'm not sure about the reasoning behind it and if removing it would break something else. I think that the best solution would be to rewrite the logic behind the workaround to something that would not need the manual publishing of Stream events.

However, there's another workaround I've found that works locally for me (alas not so cleanly): backfill the complete field to the Response class of the responselike package like that:

diff --git a/index.js b/index.js
index 6e2d40b..ff2d6a9 100644
--- a/index.js
+++ b/index.js
@@ -28,6 +28,7 @@ export default class Response extends ReadableStream {
            read() {
                this.push(body);
                this.push(null);
+               this.complete = true;
            },
        });

@@ -35,5 +36,6 @@ export default class Response extends ReadableStream {
        this.headers = lowercaseKeys(headers);
        this.body = body;
        this.url = url;
+       this.complete = false;
    }
 }

I'm mentioning a change to another packages here because it looks like it's maintained by the same group of people.

Sigill commented 1 year ago

I also reached the conclusion that compression might be the root cause, making it probably the same problem as issue https://github.com/sindresorhus/got/issues/2105.

I managed to reproduce it in some unit tests, in this MR: https://github.com/sindresorhus/got/pull/2171. It also includes tests using Github's API (even though they are quite long due to the 60s max-age).

Jimimaku commented 1 year ago

Describe the bug

  • Node.js version: 18.12.0
  • OS & version: Debian 11

Actual behavior

I am trying to use Github's API, but enabling Got's cache causes some issues.

// You will need to export GITHUB_TOKEN=12345678... to avoid Github's limits.
import got from 'got';

const cache = new Map();

async function get() {
  const response = await got('https://api.github.com/repos/gcc-mirror/gcc/tags', {
    cache,
    cacheOptions: { shared: false },
    responseType: 'json',
    headers: process.env.GITHUB_TOKEN ? {'authorization': `token ${process.env.GITHUB_TOKEN}`} : {},
  });

  console.log(`response.complete:`, response.complete);
  console.log(`Got ${response.body.length} tags`);
}

(async () => {
  await get();

  console.log("Waiting for max-age to expire");
  await new Promise(resolve => setTimeout(resolve, 65000));
  console.log("Starting second request");

  await get();
})();

Output:

response.complete: true Got 30 tags Waiting for max-age to expire Starting second request

No issue with no cache.

While investigating, I found that Github's responses have the following header: cache-control: private, max-age=60, s-maxage=60. If the second request is done within 60s of the first, the result is better (we get a usable response), however some properties of the returned object (eg: .complete) are still missing. For example, reducing the delay between the two requests to 1s generates the following output:

response.complete: true Got 30 tags Waiting for max-age to expire Starting second request response.complete: undefined Got 30 tags

Expected behavior

got() should be able to work with responses retrieved from the cache, whether they have been revalidated or not.

Fixing .complete when using a cache has already been attempted in the past: 9e15d88.

Code to reproduce

See above.

Checklist

  • [x] I have read the documentation.
  • [x] I have tried my code with the latest version of Node.js and Got.
szmarczak commented 1 year ago

The underlying cache package isn't maintained anymore. It will be replaced.