Support terminating a session with running commands

james-stocks commented 3 years ago

I have a customer request to be able to apply a timeout to commands, where we're using this library for PowerShell shells over WinRM.

When I try to send a command in one thread then have another thread call close, it terminates the running command as I want but it always throws this exception

    7: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
    6: from /Users/jstocks/code/train-winrm/lib/train-winrm/connection.rb:122:in `block in run_command_via_connection'
    5: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/shells/base.rb:79:in `run'
    4: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/shells/base.rb:138:in `with_command_shell'
    3: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/shells/base.rb:138:in `ensure in with_command_shell'
    2: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/shells/base.rb:151:in `cleanup_command'
    1: from /Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/shells/base.rb:151:in `new'
/Users/jstocks/code/bundles/train/ruby/2.7.0/gems/winrm-2.3.4/lib/winrm/wsmv/cleanup_command.rb:22:in `initialize': opts[:shell_id] is required (RuntimeError)

I'm unsure on first impression how this exception is possible when close will immediately return if there is no shell_id

It would be great to have a supported method for terminating a session when a command was running - even if it is understood to be an emergency brake with indeterminable side effects.

mwrock commented 3 years ago

My suspicion is that this is a byproduct of calling run on a new shell in a new thread. I'm guessing this came up as a result of https://github.com/inspec/train-winrm/pull/27. When you call run on the new thread, the shell sees that it has not been opened and then opens it assigning the shell id to @shell_id. Now I have been out of the ruby world for the last 4 years and just getting back in but I'd bet that instance variable @shell_id "goes away" when your thread dies and then close hits that error you mention in this issue.

I'm sure we could do some work to make this gem more thread safe, but I also think there is a workaround you could try. Try explicitly calling open on the shell after initializing it and before launching the new thread. That would create the @shell_id inside the parent thread that calls close.

james-stocks commented 3 years ago

Thanks very much for putting thought into the problem @mwrock

I did try calling open (since it's private I needed to use session.send(:open)) and the cleanup_command.rb:22:ininitialize': opts[:shell_id] is required` RuntimeError still occurs.

Looking at that error a second time, it does come from the wmrv/cleanup_command constructor and aside from tracing what's happening exactly with shell_id it might not be wise to attempt to send a cleanup command on a connection we just terminated a thread on.

What I'm adding in https://github.com/inspec/train-winrm/pull/27 is going to be an "emergency brake" - if you reach this timeout you are in an error state, you are going to get an exception, and you need to code some fix to not reach the timeout next time. This issue is asking for some supported way to terminate a command. I'm not sure with WinRM being HTTP based if this is feasible, and am OK if we just close this issue and require train-winrm to deal with any exceptions it gets from deciding to slam down the receiver.

mwrock commented 3 years ago

I'm looking at the stack trace again and now noticing it is coming from an ensure inside a method called by run so it is actually occurring on the same thread where @shell_id is assigned so my above theory is wrong. I think this is a race condition where train's call to close occurs before the cleanup. As I am thinking of this, I think if shell_id does not exist, we should assume the shell is closed and not attempt to cleanup.

mwrock commented 3 years ago

I've submitted #316 which would squash the annoying runtime error but, as mentioned in that PR, still does not ultimately address this issue. You would continue to run the risk of a command running indefinitely on a remote machine although the shell is closed. One thing you might consider is using an elevated shell and setting its execution_timeout property which takes a number of seconds. Since the command is run in a scheduled task, the task will end and therefore shut down its process when it exceeds that timeout.

james-stocks commented 3 years ago

Thanks @mwrock ! I reopened https://github.com/inspec/inspec/issues/1418 which was an existing request to use an elevated shell in InSpec, we'll look into it.

WinRb / WinRM

Support terminating a session with running commands #315