theophilusx / ssh2-sftp-client

a client for SSH2 SFTP
Apache License 2.0
819 stars 200 forks source link

Issue running list command as background task #528

Closed nylon22 closed 7 months ago

nylon22 commented 8 months ago

The following code works as expected. I get back my data from my ftp server:

    const sftp = new SftpClient();

    await sftp.connect({ host, port, username, password });

    const resp = await sftp.list('/');

However, If I attempt to run the list command using the same connection and with the same search criteria as a background task using any one of process.nextTick, setImmediate, or @hapi's Bounce.background like so:

process.nextTick(async () -> {
    const resp = await sftp.list('/');
});

// OR

setImmediate(async () -> {
    const resp = await sftp.list('/');
});

// OR

Bounce.background(sftp.list(/'));

I get an error stating there is no response for my search criteria: "/" from the server

Error: list: No response from server /

Any help on this would be much appreciated. Thank you for the great lib!

Node version: v20.11.1 ssh2-sftp-client versions: 9.1.0, 10.0.3

theophilusx commented 8 months ago

First thing you need to do is upgrade to v10.0.3!

I don't understand why you are trying to run it as "a background task". The code is already using promises, so calls to the sftp methods are already executed asynchonously.

Only a guess, but I suspect what is happening is that the combination of putting the promise inside a block of timer code is resulting in a signficant slowdown/delay in execution and the sftp command is timing out. That error message is one you usually get if

  1. The connection to the remote server has been lost
  2. The remote server takes too long to respond and the timeout kicks in.

I suspect it is No. 2.

You might be able to verify this by setting a much larger timeout value in the connections object, though I suspect that timeout only relates to establishing the initial connection.

At the end of the day, I just wouldn't be trying to do what your doing. There is no benefit I can see and it will jsut complicate and possible slow down your code as now you have to do even more context switching.

nylon22 commented 8 months ago

Hey @theophilusx ,

Thanks for the feedback! I'm upgrading now :)

The reason I want to execute this list command as a background task is because it sometimes takes > 60s for my ftp server to return data. As a result, I want to allow clients to poll for this data. Rather than keeping a connection open to my server for 60+s, I want to respond to the client with 202s and a location header until my ftp server has responded. At that point, I will return the data the client is requesting.

I do not believe the issue is number 2. For the scenario I'm testing with, the server responds in < 5s with data when run outside of a background task. The server responds with the error in my OP also in < 5s when run as a background task.

Thanks again for your feedback and any direction you can provide!

theophilusx commented 8 months ago

Note that issue 2 is not about how long the server takes to respond but about confusion within node's execution context arising from the mixing of items between the task queue (timers) and microtask queue (promises). The result is that things don't get called in the correct order and it appears as though there has been no response from the server.

Note that when you call sftp.list(), it is executed asynchronously i.e. in the background. It is when you use await that it becomes synchronous. e.g.

let resp = await sftp.list()

it is the await doing that as it forces execution to wait until the Promise returned by sftp.list() is settled. You can just wait for that resolution at a later point in your code For example, you can just do

let resp = sftp.list() // do some stuff, like respond to client saying "Working...." await resp; // wait for list Promise to be resolved // do something with resp data

As an aside, a common mistake people make with await is they will do

let r1 = await slowFunc1(); let r2 = await slowFunc2();

instead of

let r1 = slowFunc1(); let r2 = slowFunc2();

await r1; await r2;

which will often be faster because both slowFunc1() and slowFunc2() are pushed onto the queue together allowing maximum concurrency exploit by the node runtime. For exmaple, while slowFunc1 is waiiting for disk I/O, node will switch to slowFUnc2 and then switch back to slowFunc1, efffectively interleaving execution of the two calls leading to faster overall completion where await slofunc1; await slowfunc2 makes the two run synchronously 1 after the other, so no interleaving or performance gain.

In your situation, I would probably use a promise chain as that has stronger guarantees regarding the timing/execution. For exmaple

sftp.list(..).then((listData) => { // do something with listData, perhaps setting a state flag // indicating data has been returned }); // do 202 +_ location info to client

The 'then' function will be executed asynchronously once the sftp.list() method has completed. This function can put the data into some state object and perhaps set a 'done' flag or similar.

nylon22 commented 8 months ago

hey @theophilusx . Thanks again for the feedback. I agree with your rationale. Unfortunately, I've only been able to get back data if I await the promise, even when using a promise chain:

// Returns data

  await sftp.list('/').then(async (nodesResp) => {
    // do something with data
  })
    .catch(error => {
      // log error
    });
// Error gets logged 

  sftp.list('/').then(async (nodesResp) => {
    // do something with data
  })
    .catch(error => {
      // log error
    });
theophilusx commented 8 months ago

There is an error in your code.

When using Promise chains, you don't use the async or await keywords. So your code should be

sftp.list("/") .then(data => { // do something with data, which is what is returned by // list() console.log(data); }) .catch (e => { consloe.log(e); })

While async/await and Promise chaines are both based on Promises, you don't mix the two together in the same bit of code. You might use both await/async and Promise chaines in the same application (for exmaple, look at the source code for ssh2-sftp-client, where some methods use promise chaines and some use async/await), but don't use them together at the same time.

The async keyword tells JS to put its return value inside a Promise object. However, in a Promise chain, the function is already returnig a Promise object. By using async, your causing JS to return a promise in a promise. You don't want that because it will cause your 'outer code' to fire when the outer promise is settled, which could be before the inner promise has settled.

Note also that you can build up long promise chains. All you need to ensure is that each 'then' block in the chain also returns a promise. For example, you could do something like

sftp.list('/some/path') .then(listing => { let firstFile = listing[0].name; return sftp.get(name, './download.txt'); }) .then(getResp => { console.log(getResp); return sftp.end(); }) .catch(e => { console.log(e); });

I feel you may be lacking some fundamental understanding of how JS and node work with respect to asynchronous code execution and the event loop. In particular, the relationship and differences between Promises and promise chains and async/await. I would strongly recommend you work on improving yuour understanding in these areas as it will definitely make things a lot easier. For one thing, a better understanding of this area will certianly make the app your building much easier to implement as it will provide much greater clarity on design and implementation.

You might find the following video useful

https://www.youtube.com/watch?v=i9lNaTQ6n9c

The original JS The hard Parts series is better, but I think it is only availble via Frontend Masters, which is a little expensive. If your someone who prefers reading rather than watching, the below free book is also a good resource.

https://github.com/KBPsystem777/You-Dont-Know-JS/blob/master/async&performance/README.md#you-dont-know-js-async--performance

nylon22 commented 7 months ago

Thanks @theophilusx , I understand awaiting a Promise chain is bad practice, but that was the only way I was able to get data back from my ftp server instead of an error when using the list command from this library. And for what I'm trying to do, I don't want to halt execution to await the result of the promise.

For anyone who has a similar use case, I was able to resolve this by using the readdir command from ssh2: https://github.com/mscdex/ssh2?tab=readme-ov-file#get-a-directory-listing-via-sftp

Thank you again for your time and feedback! @theophilusx