sshnet / SSH.NET

SSH.NET is a Secure Shell (SSH) library for .NET, optimized for parallelism.
http://sshnet.github.io/SSH.NET/
MIT License

Sshclient deadlock/freeze on disconnect #355

Closed: gary-holland closed this issue 4 years ago

gary-holland commented 6 years ago

Hi,

My code hangs when it calls the Disconnect function.

This snippet is an example of what is locking:

SshClient client = new SshClient("server", "user", "pass");
client.Connect();
client.Disconnect(); // freezes here

I'm connecting to an SSH server running on a QNAP NAS; the server version is listed as "SSH-2.0-OpenSSH_7.6".

This is on OSX (High Sierra). I just tried in Windows, and the issue doesn't exist there.

Any help would be much appreciated.

Thanks.

keyant commented 6 years ago

I also encountered the same problem (also on OSX High Sierra).

I found that the code blocks on this line: _messageListenerCompleted.WaitOne(); in the Disconnect() method of Session.cs.
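To illustrate why an untimed wait can hang forever, here is a minimal, self-contained sketch. It is not SSH.NET code: `WaitDemo` and `WaitWithTimeout` are hypothetical names, and only `EventWaitHandle` and `WaitOne` are real .NET APIs. An untimed `WaitOne()` returns only after something calls `Set()`, whereas `WaitOne(TimeSpan)` returns `false` on timeout so the caller can give up.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration; not part of SSH.NET.
public static class WaitDemo
{
    // Returns true if the handle was signaled within the timeout, false otherwise.
    // An untimed handle.WaitOne() would instead block until Set() is called,
    // which never happens if the signaling thread is itself stuck.
    public static bool WaitWithTimeout(EventWaitHandle handle, TimeSpan timeout)
    {
        return handle.WaitOne(timeout);
    }
}
```

Calling `WaitDemo.WaitWithTimeout(handle, TimeSpan.FromSeconds(2))` on a handle that nobody sets returns `false` after roughly two seconds instead of blocking indefinitely.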

oktinaut commented 6 years ago

The problem occurs on Sierra as well (using .NET Core 2.0).

HalfLegend commented 6 years ago

Same issue, also on OSX High Sierra. Not only SshClient; ScpClient has the issue as well.

Any update?

HalfLegend commented 6 years ago

Now I am using the following code to avoid the issue:

Task.Factory.StartNew(() => {
    sshClient.Dispose();
});

and kill these threads later.

Though the threads are blocked, the socket resources are in fact freed successfully before _messageListenerCompleted.WaitOne() is called.
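A variation on this workaround, sketched under assumptions: run Dispose on a thread-pool thread and wait for it only for a bounded time. `BoundedDispose`, `TryDispose`, and `BlockingResource` are hypothetical names introduced for illustration; `BlockingResource` merely stands in for a client whose Dispose hangs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a client whose Dispose() hangs until released.
public sealed class BlockingResource : IDisposable
{
    private readonly ManualResetEventSlim _release = new ManualResetEventSlim(false);

    public void Release() => _release.Set();

    public void Dispose() => _release.Wait();
}

public static class BoundedDispose
{
    // Runs Dispose() on a thread-pool thread and waits at most 'timeout' for it.
    // Returns true if Dispose() finished in time. On false, the worker thread is
    // still blocked inside Dispose(); this only keeps the caller from hanging.
    public static bool TryDispose(IDisposable resource, TimeSpan timeout)
    {
        Task disposeTask = Task.Run(() => resource.Dispose());
        return disposeTask.Wait(timeout);
    }
}
```

Note that this does not free the blocked worker thread. On .NET Core, Thread.Abort throws PlatformNotSupportedException, so such threads cannot simply be killed later; they stay blocked until the wait handle is set or the process exits.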

JulianRooze commented 6 years ago

We ran into the same issue, but on Linux. I can confirm that it's waiting on _messageListenerCompleted.WaitOne() inside Disconnect.

We have a wrapper class around the SshClient, and we worked around the issue by disconnecting the socket through reflection (if necessary) after a 2 second delay and setting the wait handle. The Dispose method of our wrapper class:

public void Dispose()
{
  if (_client == null) return;

  Task.Run(() =>
  {
    _log.Debug("Disposing _client");

    var timer = new System.Timers.Timer();

    timer.Interval = 2000;
    timer.AutoReset = false;

    timer.Elapsed += (s, e) =>
    {
      try
      {
        var sessionField = _client.GetType().GetProperty("Session", BindingFlags.NonPublic | BindingFlags.Instance);

        if (sessionField != null)
        {
          var session = sessionField.GetValue(_client);

          if (session != null)
          {
            var socketField = session.GetType().GetField("_socket", BindingFlags.NonPublic | BindingFlags.Instance);

            if (socketField != null)
            {
              var socket = (Socket)socketField.GetValue(session);

              if (socket != null)
              {
                _log.Debug($"Socket state: Connected = {socket.Connected}, Blocking = {socket.Blocking}, Available = {socket.Available}, LocalEndPoint = {socket.LocalEndPoint}, RemoteEndPoint = {socket.RemoteEndPoint}");

                _log.Debug("Set _socket to null");

                try
                {
                  socket.Dispose();
                }
                catch (Exception ex)
                {
                  _log.Debug("Exception disposing _socket", ex);
                }

                socketField.SetValue(session, null);
              }
              else
              {
                _log.Debug("_socket was null");
              }
            }

            var messageListenerCompletedField = session.GetType().GetField("_messageListenerCompleted", BindingFlags.NonPublic | BindingFlags.Instance);

            var messageListenerCompleted = (EventWaitHandle)messageListenerCompletedField.GetValue(session);

            if (messageListenerCompleted != null)
            {
              var waitHandleSet = messageListenerCompleted.WaitOne(0);

              _log.Debug($"_messageListenerCompleted was set = {waitHandleSet}");

              if (!waitHandleSet)
              {
                _log.Debug($"Calling Set()");
                messageListenerCompleted.Set();
              }
            }
            else
            {
              _log.Debug("_messageListenerCompleted was null");
            }
          }
          else
          {
            _log.Debug("Session was null");
          }
        }
      }
      catch (Exception ex)
      {
        _log.Debug($"Exception in Timer event handler", ex);
      }
    };

    timer.Start();

    _client.Dispose();

    _log.Info("Disposed _client");
  });
}

A typical log when it fails to disconnect:

[image: screenshot of the log]

But it usually disconnects just fine; failing to disconnect is the exception. The ratio is about 20 failures to 700 successful disconnects in about 12 hours.

JulianRooze commented 6 years ago

I should add that it happens much more often that _socket is already null when disconnecting hangs, and only _messageListenerCompleted.Set() wasn't called yet. So the log usually says this when the disconnect fails:

[image: screenshot of the log]

So the _socket still being connected is the exception.

bloudraak commented 6 years ago

I have the same issue using .NET Core 2.0 on macOS Sierra 10.12.6 (16G1212)

using (var client = new SshClient(host, username, password))
{
    client.Connect();
    var command = client.CreateCommand("df -H");
    var result = command.Execute();
    Console.WriteLine(result);
    client.Disconnect();
}

neseih commented 6 years ago

Just ran into the same issue on Ubuntu 16.04, dotnet core 2.1

jolosimeon commented 6 years ago

Ran into the same problem on macOS High Sierra 10.13.5, dotnet core version 2.1.300. It fails to disconnect and also hangs at the end of "using (var client = new SftpClient(...))".

drieseng commented 6 years ago

What version of SSH.NET are you using? Can you easily reproduce this issue, or does it only happen occasionally?

JulianRooze commented 6 years ago

@drieseng We use the latest (2016.1.0). From our logs, in about 4% of cases it hasn't set the waithandle (see our workaround code above) so it would wait indefinitely. Ever since upgrading to .NET Core 2.1 we haven't seen the cases where the socket was still connected, so I think that's solved by that upgrade, but it was really rare before so it might be a coincidence. I think the occurrence of the issue is more or less random and probably a timing thing.

HalfLegend commented 6 years ago

@drieseng Any update? It can be easily reproduced on macOS: just open an SSH client and close it.

cvallance commented 6 years ago

@drieseng - Getting it on macOS High Sierra dotnet core v2.1.301. Using the latest nuget package 2016.1.0. Easily reproducible... just open a connection and try to call Disconnect.

drieseng commented 6 years ago

I don't have a mac :( Can you reproduce it on Linux?

cvallance commented 6 years ago

Just tried to reproduce it in docker and couldn't... which I guess is a good thing because that's what we're using outside of our dev environments.

Still, would be good to have it fixed for dev purposes. I'll see if I can have a crack at fixing but honestly, I wouldn't know where to start 😄

cvallance commented 6 years ago

So when I slowly step through the code it doesn't hang... I changed all the DiagnosticAbstraction.Log calls to plain Console.WriteLine and got the following.

This is when it does hang: [screenshot, 2018-07-20 10:02:25 am]

This is when it does NOT hang: [screenshot, 2018-07-20 10:00:42 am]

cvallance commented 6 years ago

And this is from the docker container (which doesn't hang):

[screenshot, 2018-07-20 10:47:07 am]

Spica92 commented 6 years ago

Hi, I need some help: I can connect correctly, but any command I send hangs indefinitely. Access via PuTTY works fine. I tried adding a sleep between connecting and sending the command, but the result is the same. If anyone has a clue, it would be appreciated. Renci.Ssh.net version 1.0.1.0. Thanks.

KeyboardInteractiveAuthenticationMethod kauth = new KeyboardInteractiveAuthenticationMethod("integrator");
kauth.AuthenticationPrompt += new EventHandler<Renci.SshNet.Common.AuthenticationPromptEventArgs>(HandleKeyEvent);
PasswordAuthenticationMethod pauth = new PasswordAuthenticationMethod("integrator", "myPassword");
ConnectionInfo connectionInfo = new ConnectionInfo("10.2.112.00", 22, "integrator", pauth, kauth);
SshClient sshClient = new SshClient(connectionInfo);
SshCommand cmd = sshClient.CreateCommand($"xFeedback register /Status/Audio");
string s = cmd.Execute(); // <-- hangs here forever
sshClient.Disconnect();

It fails here in SshCommand.cs:

private void WaitOnHandle(WaitHandle waitHandle)
{
    var waitHandles = new[]
        {
            _sessionErrorOccuredWaitHandle,
            waitHandle
        };

    switch (WaitHandle.WaitAny(waitHandles, CommandTimeout)) // WaitHandle never raised
    {
        case 0:
            throw _exception;
        case WaitHandle.WaitTimeout:
            throw new SshOperationTimeoutException(string.Format(CultureInfo.CurrentCulture, "Command '{0}' has timed out.", CommandText));
    }
}

drieseng commented 6 years ago

@Spica92 Please create a separate issue for this problem. Include as much details as possible (OS, SSH server, exact version of SSH.NET, …).

drieseng commented 6 years ago

@cvallance From your traces, it looks as if .NET Core doesn't break out of the Socket.Select(IList checkRead, IList checkWrite, IList checkError, int microSeconds) call in Session.MessageListener() when the socket is disposed. I'll try to have a closer look tomorrow.

You can always add some Console.WriteLine calls before and after the Socket.Select(IList checkRead, IList checkWrite, IList checkError, int microSeconds) call, and in the finally block.
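For readers unfamiliar with Socket.Select, here is a minimal sketch of a listener-style wait that does not depend on Select returning when the socket is disposed. This is a swapped-in polling technique for illustration, not how Session.MessageListener() is written; `SelectDemo`, `WaitReadable`, and the `stop` flag are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;
using System.Threading;

// Hypothetical helper for illustration; not part of SSH.NET.
public static class SelectDemo
{
    // Polls 'socket' for readability in short slices until data arrives or
    // 'stop' is signaled. Returns true if the socket became readable.
    public static bool WaitReadable(Socket socket, ManualResetEventSlim stop)
    {
        while (!stop.IsSet)
        {
            var readList = new List<Socket> { socket };

            // 100000 microseconds = 100 ms. Select removes sockets that are
            // not ready from the list, so an empty list means "nothing to read".
            Socket.Select(readList, null, null, 100000);

            if (readList.Count > 0)
            {
                return true;
            }
        }

        return false;
    }
}
```

Because each Select call returns after at most 100 ms, the loop can notice the `stop` flag even if disposing the socket does not interrupt the call. The trade-off is periodic wake-ups while the connection is idle.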

drieseng commented 6 years ago

I've been able to reproduce this issue on Linux. It appears to be a bug (regression?) in .NET Core 2.1. Filed as https://github.com/dotnet/corefx/issues/31368

Calvfre commented 6 years ago

Thanks for identifying the bug. Until it is fixed, is there a suggested / preferred work around we should use?

JulianRooze commented 6 years ago

@Calvfre the workaround code I posted above has been working well for us for a few months now. Nasty reflection, but it works.

mc-denisov commented 6 years ago

The .NET team says to use Shutdown() to prevent an endless hang: https://github.com/dotnet/corefx/pull/26898

Calvfre commented 6 years ago

@mc-denisov If I am looking at this correctly, Shutdown is already being used in the Session class: "_socket.Shutdown(SocketShutdown.Send);"

Calvfre commented 6 years ago

@JulianRooze I must be doing something wrong. I attempted to create a wrapper for SshClient: I inherited from SshClient and overrode Dispose with your example, but I am still having trouble figuring out your "_client". In all my attempts so far, Dispose is never called when I call Disconnect from my code. Note that in my main code I declare my client with using. If I do not call Disconnect, then exiting the using statement calls the overridden Dispose code.

Declaration:

public class WrSshClient : SshClient { ... }

using (WrSshClient client = new WrSshClient(port.BoundHost, (int)port.BoundPort, username, keyfile))
{
    client.Connect();
    ...
    client.Disconnect();
}

mc-denisov commented 6 years ago

@drieseng prepared an example for System.Net.Sockets: https://github.com/dotnet/corefx/issues/31368#issue-344579454. If the example is corrected as follows:

...
        Console.WriteLine("Disposing socket...");
-       client.Dispose();
+       client.Shutdown(SocketShutdown.Receive);
+       client.Dispose();
...

then the disconnect occurs properly.

JulianRooze commented 6 years ago

@Calvfre Sorry if my example isn't clear, our wrapper doesn't inherit from SshClient, it's just a class implementing IDisposable. The _client member is also an instance of SftpClient, not SshClient, but I've checked and both of those inherit from BaseClient and so both have a Session member that my workaround code depends on.

uffebjorklund commented 6 years ago

For me this issue is resolved by replacing

_socket.Shutdown(SocketShutdown.Send);

with

_socket.Shutdown(SocketShutdown.Both);

in SocketDisconnectAndDispose in Session.cs.
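As a standalone illustration of the shutdown-then-dispose order, here is a minimal sketch using plain System.Net.Sockets. It is not the SSH.NET implementation of SocketDisconnectAndDispose; `SocketTeardown` and `ShutdownAndDispose` are hypothetical names.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration; not part of SSH.NET.
public static class SocketTeardown
{
    // Shuts down both directions before disposing, so that a thread blocked in
    // a receive or select on this socket is given a chance to wake up.
    public static void ShutdownAndDispose(Socket socket)
    {
        try
        {
            if (socket.Connected)
            {
                socket.Shutdown(SocketShutdown.Both);
            }
        }
        catch (SocketException)
        {
            // The peer may already have closed the connection; still dispose.
        }
        catch (ObjectDisposedException)
        {
            // The socket was already disposed; nothing left to shut down.
        }
        finally
        {
            socket.Dispose();
        }
    }
}
```

Whether this is sufficient on a given runtime and OS is something to verify locally; later comments in this thread report mixed results on macOS.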

EDIT: I really need SFTP for .NET CORE today, so I forked and created a temporary package on nuget. Will switch back to the real package when this issue has been solved. https://www.nuget.org/packages/SSH.NET.Fork/2018.8.25.2

darkguy2008 commented 6 years ago

The fix is just a one-line change; can we create and merge a PR to fix this? I also need this to work...

deboshy commented 6 years ago

Hm, @uffebjorklund I tried your SSH.NET.Fork package and the issue is not solved; the connection gets stuck the moment the client is disposed or disconnected.

using (var client = new SshClient(Connection(dhcpServer, "root")))
{
    client.Connect();
    var command = client.CreateCommand("df -H").Execute();

    // 1. Disconnect only
    client.Disconnect();

    // 2. Dispose with Disconnect
    client.Dispose();
    client.Disconnect();

    // 3. Dispose method by @JulianRooze
    Dispose(client);
    client.Disconnect();
}

None of these approaches works. Maybe I'm doing something wrong, but I run 5 tasks, each every 30 seconds, and my server hangs after 6 hours due to lack of memory. This is critical for our work.

Does anyone have a working solution or another library?

uffebjorklund commented 6 years ago

We run our simple fix in docker (linux alpine) and it works well in there. What OS are you running on @asphyxiatedx ?

Edit: We do this...

var connectionInfo = new ConnectionInfo(this.Settings.Host, this.Settings.Port, this.Settings.UserName, new PasswordAuthenticationMethod(this.Settings.UserName, this.Settings.Password));
this.client = new SftpClient(connectionInfo);

// later on...
if(!client.IsConnected)            
{
    client.Connect();                
}

using(var uploadStream = File.OpenRead(filename))
{
    if(this.Settings.RemoteDirectory.Trim().Length > 0)
    {
        client.ChangeDirectory(this.Settings.RemoteDirectory);
    }
    client.UploadFile(uploadStream,fi.Name,true);                
}   
this.client.Disconnect();

deboshy commented 6 years ago

@uffebjorklund thank you for the reply. I use macOS High Sierra 10.13.6 (x64), dotnet version 2.1.401.

I tried this:

var connectionInfo = new ConnectionInfo(server, port, user, new PrivateKeyAuthenticationMethod(user, new PrivateKeyFile(key, password)));
var client = new SshClient(connectionInfo);
client.Connect();
if(client.IsConnected)
{
    var command = client.CreateCommand("df -H").Execute();
    client.Disconnect();
}

And this solution does not work. The connection freezes on the first or second try.

darkguy2008 commented 6 years ago

@asphyxiatedx make a class that implements IDisposable and use @JulianRooze's code. It worked perfectly fine for me.

deboshy commented 6 years ago

@darkguy2008 can you show an example please?

darkguy2008 commented 6 years ago

@asphyxiatedx here's my code, it's not rocket science anyways:

using System;
using System.Net.Sockets;
using System.Reflection;
using System.Threading;
using System.Threading.Tasks;
using Renci.SshNet;

namespace api_users
{
    public class SSHWrapper : IDisposable
    {
        public SshClient Client;

        public SSHWrapper(ConnectionInfo connectionInfo)
        {
            Client = new SshClient(connectionInfo);
        }

        public void Dispose()
        {
            if (Client == null) return;
            Task.Run(() =>
            {
                var timer = new System.Timers.Timer();
                timer.Interval = 2000;
                timer.AutoReset = false;
                timer.Elapsed += (s, e) =>
                {
                      try
                      {
                          var sessionField = Client.GetType().GetProperty("Session", BindingFlags.NonPublic | BindingFlags.Instance);
                          if (sessionField != null)
                          {
                              var session = sessionField.GetValue(Client);
                              if (session != null)
                              {
                                  var socketField = session.GetType().GetField("_socket", BindingFlags.NonPublic | BindingFlags.Instance);
                                  if (socketField != null)
                                  {
                                      var socket = (Socket)socketField.GetValue(session);
                                      if (socket != null)
                                      {
                                          try
                                          {
                                              socket.Dispose();
                                          }
                                          catch (Exception ex)
                                          {
                                          }
                                          socketField.SetValue(session, null);
                                      }
                                  }

                                  var messageListenerCompletedField = session.GetType().GetField("_messageListenerCompleted", BindingFlags.NonPublic | BindingFlags.Instance);
                                  var messageListenerCompleted = (EventWaitHandle)messageListenerCompletedField.GetValue(session);
                                  if (messageListenerCompleted != null)
                                  {
                                      var waitHandleSet = messageListenerCompleted.WaitOne(0);
                                      if (!waitHandleSet)
                                          messageListenerCompleted.Set();
                                  }
                              }
                          }
                      }
                      catch (Exception ex)
                      {
                      }
                  };

                timer.Start();
                Client.Dispose();
            });
        }

    }
}

deboshy commented 6 years ago

@darkguy2008 thank you for the reply. I copied your class and tried running my tasks, but it did not help.

var connectionInfo = new ConnectionInfo(server, port, user, new PrivateKeyAuthenticationMethod(user, new PrivateKeyFile(key, password)));
var test = new SSHWrapper(connectionInfo);
var client = test.Client;
client.Connect();
if (client.IsConnected)
{
    var command = client.CreateCommand("df -H").Execute();
    test.Dispose();
    client.Disconnect();
    client.Dispose();
}

Am I misusing the class?

If I call client.Disconnect before client.Dispose, everything hangs. If I call test.Dispose before client.Disconnect, the client is neither disconnected nor disposed; it has no effect and everything hangs. If I call test.Dispose, then client.Disconnect, and then client.Dispose, everything hangs.

deboshy commented 6 years ago

I tried this approach:

client.Connect();
if(client.IsConnected)
{
    client.CreateCommand("df -H").Execute();

    try
    {
        var sessionField = client.GetType().GetProperty("Session", BindingFlags.NonPublic | BindingFlags.Instance);
        if (sessionField != null)
        {
            var session = sessionField.GetValue(client);
            if (session != null)
            {
                var socketField = session.GetType().GetField("_socket", BindingFlags.NonPublic | BindingFlags.Instance);
                if (socketField != null)
                {
                    var socket = (Socket)socketField.GetValue(session);
                    if (socket != null)
                    {
                        try
                        {
                            socket.Dispose();
                        }
                        catch (Exception ex)
                        {
                        }
                        socketField.SetValue(session, null);
                    }
                }

                var messageListenerCompletedField = session.GetType().GetField("_messageListenerCompleted", BindingFlags.NonPublic | BindingFlags.Instance);
                var messageListenerCompleted = (EventWaitHandle)messageListenerCompletedField.GetValue(session);
                if (messageListenerCompleted != null)
                {
                    var waitHandleSet = messageListenerCompleted.WaitOne(0);
                    if (!waitHandleSet) messageListenerCompleted.Set();
                }
            }
        }
    }
    catch (Exception ex)
    {
    }

    client.Dispose();
}

Before calling client.Dispose, I can see in the debugger that the connection is disconnected; then client.Dispose is called and the client is disposed. But in the OS system monitor, the number of ports and threads of the dotnet process increases with each pass, which means the resources are not released and Dispose does not really work.

baotn166 commented 5 years ago

Hi all,

Can anyone tell me when this problem will be fixed? I can do a workaround as @JulianRooze said.

Thank you

bbhoss commented 5 years ago

Hi all,

Can anyone tell me when this problem will be fixed? I can do a workaround as @JulianRooze said.

Thank you

It looks like it has been fixed in .NET Core, so hopefully it will be resolved in the next release.

CodingInfinite commented 5 years ago

Downgrading to version 2016.0.0 also fixed the problem.

baotn166 commented 5 years ago

@CodingInfinite I'm not sure that SSH.NET version 2016.0.0 can run on dotnet core 2.1. Let me try.

loop-evgeny commented 5 years ago

Downgrading to version 2016.0.0 also fixed the problem.

Confirmed! Now if only some brave soul would figure out why 2016.0.0 works and port the fix/workaround from that to the latest version (if possible).

MRhyne1931 commented 5 years ago

Downgrading to version 2016.0.0 also fixed the problem.

Confirmed! Now if only some brave soul would figure out why 2016.0.0 works and port the fix/workaround from that to the latest version (if possible).

Someone on my team just ran into this issue as well and using 2016.0.0 caused the problem to go away. Is there any idea on when this would be fixed? Thanks. This issue thread was very enlightening.

hungarianguy commented 5 years ago

Can you temporarily put the "Shutdown" recommendation into Dispose while we are waiting for the dotnet team to ship their fix?

OR should we mitigate it by either:

  1. not calling Disconnect OR
  2. using one of the methods above

sbc3 commented 5 years ago

I'm still getting this issue in 2016.1.0. Downgrading to 2016.0.0 fixes the issue as others have said.

ganeshkamath89 commented 5 years ago

Now I am using the following code to avoid the issue:

                Task.Factory.StartNew(() => {
                    sshClient.Dispose();
                });

and kill these threads later.

Though the threads are blocked, the socket resources are in fact freed successfully before _messageListenerCompleted.WaitOne() is called.

Hi. I am facing the same issue too. Could you also mention where you ran this code snippet, please?

squalsoft commented 5 years ago

It seems the author doesn't support this package anymore. We need a new fork of this for .NET Core!

drieseng commented 5 years ago

@squalsoft I have not abandoned SSH.NET, but I have little or no time to work on it.

If I try to find some time in the next days, can you build SSH.NET from source and validate the fix?

squalsoft commented 5 years ago

@drieseng I can build and test it. This is a critical bug, and this project doesn't have alternatives in .NET Core.