tomas / needle

Nimble, streamable HTTP client for Node.js. With proxy, iconv, cookie, deflate & multipart support.
https://www.npmjs.com/package/needle
MIT License
1.62k stars, 235 forks

`get` with redirect leads to ~30sec hang #435

Open stephematician opened 3 months ago

stephematician commented 3 months ago

I'm trying to download a file (or make other 'get' requests) from URLs that redirect, but the process seems to hang for some time afterwards. E.g. let's say I use a promise to download a file:

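import needle from "needle"; // assumes esModuleInterop is enabled

function get_file(url: string, file_path: string) : Promise<void> {

    return new Promise<void>((resolve, reject) => {
        needle.get(
            url, { follow_max: 1, output: file_path },
            (error) => {
                // Settle the promise exactly once: stop after rejecting.
                if (error) { reject(error); return; }
                resolve();
            }
        );
    });

}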

async function foo() : Promise<void> {
    await get_file(
        "https://github.com/tomas/needle/archive/refs/tags/v3.2.0.zip",
        "foo.zip"
    );
    console.log("So we should be done here.");
}

foo();
// > So we should be done here.
// then a 30 second delay (roughly) before the program exits

The only way around this that I've had luck with, so far, is to attach a signal to the request and abort it after the file has downloaded.

It seems to be an issue when following redirects? If I download a file that doesn't require a redirect, there isn't any delay before the program ends.
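
The abort-based workaround can be sketched roughly as follows. This is a minimal sketch rather than a fix: it assumes needle's `signal` request option accepts an `AbortSignal`, and `getFileThenAbort` and its `client` parameter are hypothetical names. `client` stands in for the needle module so the helper carries no hard dependency.

```javascript
// Hypothetical helper: `client` is expected to be the needle module,
// i.e. an object with get(url, options, callback).
function getFileThenAbort(client, url, filePath) {
  const controller = new AbortController();
  return new Promise((resolve, reject) => {
    client.get(
      url,
      // Assumption: needle honours `signal` as an AbortSignal.
      { follow_max: 1, output: filePath, signal: controller.signal },
      (error) => {
        // Abort once the callback has fired, so any request that is
        // still open is torn down instead of lingering until timeout.
        controller.abort();
        if (error) { reject(error); return; }
        resolve();
      }
    );
  });
}
```

It would be called as `getFileThenAbort(require('needle'), url, 'foo.zip')`. Whether aborting at this point is the right place to cut the lingering request is a judgment call; it only mirrors the workaround described above.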

stephematician commented 3 months ago

I can get the same behaviour with a simple change to example/download-to-file.js:

var fs = require('fs'),
    needle = require('needle'),
    path = require('path');

var url  = process.argv[2] || "https://github.com/tomas/needle/archive/refs/tags/v3.2.0.zip";
var file = path.basename(url);

console.log('Downloading ' + file);

needle.get(url, { output: file, follow: 3 }, function(err, resp, data){
  console.log('File saved: ' + process.cwd() + '/' + file);

  var size = fs.statSync(file).size;
  if (size == resp.bytes)
    console.log(resp.bytes + ' bytes written to file.');
  else
    throw new Error('File size mismatch: ' + size + ' != ' + resp.bytes);
});
// > Downloading v3.2.0.zip
// > File saved: /home/username/v3.2.0.zip
// > 79318 bytes written to file.
// then ~30sec wait til exit

stephematician commented 2 months ago

@Afam007 You could not reproduce? What system and Node.js version are you using?

stephematician commented 2 months ago

Thanks, but I'll wait and see if others can reproduce given the MWE I provided.

stephematician commented 2 months ago

I tested with a non-GitHub file that also requires a redirect, and the results are the same, e.g. when substituting http://www.vim.org/scripts/download_script.php?src_id=9750 for url. The file downloads, and then the program waits for about 30 seconds before exiting.

I do not observe the same behaviour with ky, e.g. with a simple test module:

import ky from "ky";

var file_url  = process.argv[2] || "http://www.vim.org/scripts/download_script.php?src_id=9750";

async function foo(file_url) { const response = await ky(file_url); }

foo(file_url);
// Exits promptly - although note I have not written the output to file, but I suspect
// that would not make a difference.

stephematician commented 2 months ago

Also works without 30sec hang using fetch:

const fs = require('fs');
const { Readable } = require('stream');
const { finished } = require('stream/promises');

async function foo() {
    const stream = fs.createWriteStream('v3.2.0.zip');
    const { body } = await fetch('https://github.com/tomas/needle/archive/refs/tags/v3.2.0.zip');
    await finished(Readable.fromWeb(body).pipe(stream));
}

foo();

As it stands, I don't think it makes sense that the ISP would be at fault and that a proxy/VPN is needed to download. All the other approaches work from my machine on the ISP. No other application hangs after a file is downloaded like needle does.

It is also not limited to file downloads. Other content types that require a redirect show the same behaviour.

stephematician commented 2 months ago

Lastly: here's some output that shows one of the requests (the original one) doesn't close even though I think request.end() should have been called within needle's send_request.

var fs = require('fs'),
    needle = require('needle'),
    path = require('path');

var url  = process.argv[2] || "https://github.com/tomas/needle/archive/refs/tags/v3.2.0.zip";
var file = path.basename(url);

console.log('Downloading ' + file);

var x = needle.get(url, { output: file, follow: 3 }, function(err, resp, data){
  console.log('File saved: ' + process.cwd() + '/' + file);

  var size = fs.statSync(file).size;
  if (size == resp.bytes)
    console.log(resp.bytes + ' bytes written to file.');
  else
    throw new Error('File size mismatch: ' + size + ' != ' + resp.bytes);
});
let start = new Date();

x.on("response", () => {
    console.log(`response (x) at ${(new Date()).getTime() - start.getTime()}`);
});
x.on("close", () => {
    console.log(`close (x) at ${(new Date()).getTime() - start.getTime()}`);
});
x.request.on("response", () => {
    console.log(`response (x.req) at ${(new Date()).getTime() - start.getTime()}`);
});
x.request.on("timeout", () => {
    console.log(`timeout (x.req) at ${(new Date()).getTime() - start.getTime()}`);
});
x.request.on("close", () => {
    console.log(`close (x.req) at ${(new Date()).getTime() - start.getTime()}`);
});

With output:

Downloading v3.2.0.zip
response (x) at 385
response (x.req) at 389
response (x) at 790
close (x) at 823
File saved: /home/stephematician/Documents/temp/test_needle/v3.2.0.zip
79318 bytes written to file.
timeout (x.req) at 5384
close (x.req) at 30384

stephematician commented 2 months ago

Potentially related: https://github.com/nodejs/node/issues/47228
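
To see what is still holding the event loop open during the wait described in this thread, one option is to count the active resources once the download callback has fired. This is a minimal sketch, assuming a Node.js version that provides `process.getActiveResourcesInfo()` (marked experimental in the Node.js docs); `activeResourceCounts` is a hypothetical helper name, and the type names it reports are whatever Node itself returns.

```javascript
// Sketch: summarise which resource types are currently keeping the
// event loop alive. Assumes process.getActiveResourcesInfo() exists
// in the running Node.js version.
function activeResourceCounts() {
  const counts = {};
  for (const type of process.getActiveResourcesInfo()) {
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Usage idea: call this from the needle callback after the file has
// been saved, and again a few seconds later, then compare the results.
// console.log(activeResourceCounts());
```

If a socket-like entry is still listed after the callback has run, that would be consistent with the original request staying open until its timeout, but the sketch only reports what is active and does not explain why.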