wolfSSL / wolfssh

wolfSSH is a small, fast, portable SSH implementation, including support for SCP and SFTP.
https://www.wolfssl.com
381 stars 88 forks source link

If the client is suspended, the server enters a busy loop with WS_WINDOW_FULL #607

Closed falemagn closed 1 year ago

falemagn commented 1 year ago

To reproduce on unix:

  1. start the echoserver
  2. connect with sftp from the command line
  3. start downloading a file from the server
  4. press CTRL+Z in the console

The echoserver will start spinning, printing out endless consecutive copies of this log snippet:

2023-10-20 17:08:51 [SFTP] SFTP buffer sent 31936 / 32781 bytes
2023-10-20 17:08:51 [DEBUG] Entering wolfSSH_get_error()
2023-10-20 17:08:51 [SFTP] Entering wolfSSH_SFTP_read()
2023-10-20 17:08:51 [DEBUG] Entering wolfSSH_stream_send()
2023-10-20 17:08:51 [DEBUG] Entering SendChannelData()
2023-10-20 17:08:51 [DEBUG] Entering ChannelFind(): self 0
2023-10-20 17:08:51 [DEBUG] Leaving ChannelFind(): 0x55ef3588de80
2023-10-20 17:08:51 [DEBUG] channel window is full
2023-10-20 17:08:51 [DEBUG] Leaving SendChannelData(), ret = -1073
2023-10-20 17:08:51 [DEBUG] Leaving wolfSSH_stream_send(), txd = -1073
2023-10-20 17:08:51 [SFTP] SFTP buffer sent 31936 / 32781 bytes
ejohnstown commented 1 year ago

How did you configure wolfSSH and which version are you using? I'm not getting the window full status.

I'm on macOS; I consider it BSD. I'm configuring with:

./configure --enable-sftp --enable-debug && make

I'm using the master branch (commit 173dfd9). I'm running the server with ./examples/echoserver/echoserver. The client, I'm using OpenSSH, sftp -P 22222 jill@localhost. I change directory to one where I have a multi GB file, and get muzak.mp3. I am able to suspend the client, repeatedly, and bring it back to the foreground multiple times, for dozens of sections per suspend. The echoserver picks right back up.

I also went back to the v1.4.14 release. I configured with:

./configure --enable-sftp --enable-debug CFLAGS="-DWOLFSSH_NO_SSH_RSA_SHA1 -DWOLFSSH_NO_AES_CTR"

I had to use those CFLAGS because the version of OpenSSH I'm using doesn't allow SHA-1 anything, and that version of wolfSSH has a bug with AES-CTR. I used the same sftp command connecting to the echoserver. I ran the echoserver without command line options, and it worked the same as the master branch. When I ran the echoserver with -N, the transfer failed when bring the sftp client back to the foreground. (Something about a buffer state failure. That needs to be looked at.)

falemagn commented 1 year ago

I did the test on the master branch (commit 74cf1d4), configured with --enable-static --disable-shared --enable-all --enable-debug --enable-examples. The issue manifests itself also with commit 173dfd9.

Ran the echoserver with -N -d / but also without -N the issue manifests itself.

I couldn't reproduce the issue with the v1.4.14 release either, but we were having this very issue in our product that uses v1.4.14 and I thought the trigger might be the different way our worker loop was built, compared to the echoserver's one; the master branch's echoserver has seen the worker loop refactored in a way more similar to the one we're using, and incidentally I could reproduce the very same issue there.

The fact you can't reproduce it with master is puzzling, to say the least.

Got any ideas?

ejohnstown commented 1 year ago

I moved my test to my Ubuntu box and ran it in a VM and recreated this. Taking a deeper look.

ejohnstown commented 1 year ago

Should be fixed now.

falemagn commented 1 year ago

Should be fixed now.

This might address the specific behavior in the echoserver, but is it normal at all for the library to hand to the library's user the duty to deal with a full window? Doesn't it rather mean that there's something to be "fixed" in the library itself?

The echoserver for us was just a proxy for an issue we're experiencing outside of the echo server, and it's not clear whether we could employ the same strategy as the echoserver's to deal with it.

On this topic, I've added a comment to the PR: https://github.com/wolfSSL/wolfssh/pull/612#issuecomment-1787836028

ejohnstown commented 1 year ago

but is it normal at all for the library to hand to the library's user the duty to deal with a full window? Doesn't it rather mean that there's something to be "fixed" in the library itself?

If you tell me I can only send you 128kb, and I send you 128kb, there's nothing I can do to send more until you tell me you've consumed any of the data I sent. That's the agreement when setting up the channel.

WS_WINDOW_FULL is analogous to WS_WANT_WRITE, but it is for a specific channel and not the socket.