Closed: asomers closed this issue 2 years ago
Hi there,
Thanks for the interest.
Reading partial output from commands that do not terminate is described in the documentation.
Using the example from the documentation, the behaviour is as expected:
```python
from pssh.clients import ParallelSSHClient
from pssh.exceptions import Timeout

client = ParallelSSHClient(['my-host'])  # example host

output = client.run_command(
    'while true; do echo a line; sleep .1; done',
    use_pty=True, read_timeout=1)

# Read as many lines of output as hosts have sent before the timeout
stdout = []
for host_out in output:
    try:
        for line in host_out.stdout:
            stdout.append(line)
    except Timeout:
        pass

# Closing channel which has PTY has the effect of terminating
# any running processes started on that channel.
for host_out in output:
    host_out.client.close_channel(host_out.channel)

# Join is not strictly needed here as channel has already been closed and
# command has finished, but is safe to use regardless.
client.join(output)

# Can now read output up to when the channel was closed without blocking.
rest_of_stdout = list(output[0].stdout)
```
The documentation describes how to close the remote command. No thread pools should be used - channels cannot be shared (safely) across threads with parallel-ssh.
The duplicated output is because of the use of threads - there are multiple channels using the same socket across threads, and both of them will have output.
Can re-open if there is an issue with the documented example as shown above.
Use the high level APIs in the clients for writing to stdin, not OS pipes. OS pipes will not work.
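For reference, a minimal sketch of writing to stdin through the high level API, assuming parallel-ssh 2.x; the host name and command here are placeholders:

```python
from pssh.clients import ParallelSSHClient

client = ParallelSSHClient(['my-host'])  # placeholder host
output = client.run_command('cat')       # long-running command reading stdin
host_out = output[0]

# HostOutput.stdin is the supported write path - no raw channel, no OS pipe.
host_out.stdin.write('hello\n')
host_out.stdin.flush()
```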
What's wrong with my use of pipes? I'm not actually passing a pipe to parallel-ssh. I'm reading from the pipe and then writing with channel.stdin.write. Is there a problem with that? For my application, I must use pipes because the library that produces the data sends it to a file descriptor.
I don't think I can completely eliminate threads. That same library requires a thread. But that thread (_th_send) doesn't access the parallel-ssh channel. So why is that a problem?
Finally, while I did see the stuff about read_timeout in the docs, that approach won't work for me. I expect much more activity on stdin than on stdout. If I have to wait for a 1 second timeout on every loop iteration, that would be too slow. That's why I use select to multiplex the file descriptors. Is there any way to use stdout in a truly non-blocking mode?
> Is there any way to use stdout in a truly non-blocking mode?
Sure, have a look at the poll method.
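As a rough sketch, assuming host_out from a prior run_command call (see the API documentation for poll's exact signature):

```python
# Block cooperatively until there is activity on the client's socket,
# in place of a select() on the raw file descriptor.
host_out.client.poll()
```

After poll returns, reading host_out.stdout with a short read_timeout should find data already waiting rather than blocking for the full timeout.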
If you're not passing data to/from an SSH channel with pipes, that's fine. Pipes are inherently blocking, so you can't use them for network communication of SSH traffic. Reading/writing from them inside a shell is fine; that's standard shell output.
The problematic code, as far as I can see without minimal code to reproduce that can be run anywhere, is this block here:
```python
with ThreadPoolExecutor() as executor:
    fut = executor.submit(self._th_send, pin)
    status = self._feed_pipes(pout, channel)
```
Channels cannot be used in a thread pool in this way; things will break. You can call executor.submit, exit out of the thread pool, and then do the SSH channel operations - that would be fine, as sketched below. Just don't share channels/client objects across threads. Though this may not be related if the executor is not actually using the channel.
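A rough sketch of that split, reusing the names from the quoted code (self._th_send, pin, pout and channel come from the original snippet; whether this ordering fits depends on what _feed_pipes actually does):

```python
from concurrent.futures import ThreadPoolExecutor

# Only the non-channel work goes into the pool...
with ThreadPoolExecutor() as executor:
    fut = executor.submit(self._th_send, pin)
# ...the with block waits for the future, so the thread is done by here.
# All SSH channel operations stay in the main thread.
status = self._feed_pipes(pout, channel)
```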
If a standalone piece of code that I can run locally can be shown I can take a look. Otherwise just guessing from looking at code.
The other thing is you need gevent.select, not standard Python select. All low level networking imports must come from gevent if you want low level functionality. But that's what SSHClient.poll does - just use that.
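For completeness, gevent's version is a drop-in replacement for the standard library call (a sketch; fds stands in for whatever descriptors are being multiplexed):

```python
# gevent's select cooperates with the event loop instead of blocking it.
from gevent.select import select

readable, writable, _ = select(fds, [], [], 0.1)
```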
Also these:
```python
try:
    channel.stdin.write(data)
    channel.stdin.flush()
```
These are blocking calls. You need HostOutput.stdin.write as returned by run_command.
It sounds like a combination of a low read_timeout and an SSHClient.poll before writing will do what you want. But again, just guessing.
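Putting those pieces together, a hedged sketch of the suggested pattern; data_source is a placeholder for whatever feeds stdin, and the timeout values are illustrative only:

```python
from pssh.clients import ParallelSSHClient
from pssh.exceptions import Timeout

client = ParallelSSHClient(['my-host'])  # placeholder host
output = client.run_command('cat', use_pty=True, read_timeout=0.1)
host_out = output[0]

for data in data_source:        # hypothetical source of stdin data
    host_out.client.poll()      # wait for socket activity before writing
    host_out.stdin.write(data)
    host_out.stdin.flush()
    try:
        # Drain whatever output is ready; the low read_timeout keeps each
        # iteration from stalling for long when stdout is quiet.
        for line in host_out.stdout:
            print(line)
    except Timeout:
        pass
```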
What's wrong with the example I posted above? Can you not run that locally?
Can tell it doesn't work from all the blocking calls - select, channel.write etc. Once those are changed, does it work? The issue tracker is for actionable bugs, not instructions on how to use the library.
Background
I'm trying to execute a long-running command on a remote server. It needs to be fed data through stdin, and it may occasionally print output on stdout and stderr. It will not terminate until it gets EOF on its input. I have a working program that uses Paramiko. It multiplexes over the SSH channel and another file descriptor with select, and reads from both sources in non-blocking mode. However, my attempt at using parallel-ssh exhibits a few bugs.
Describe the bug
1. Why is output duplicated when iterating with for line in channel.stdout?
2. I need to read partial output from channel.stdout. How can I do that in a nonblocking way?
3. select always returns that the channel is readable, but exit_code is always None, so the program busy loops. Why doesn't exit_code get set?
To Reproduce
Execute this program like this: python3.9 cat-parallelssh.py my-host. To see problem 3, change the range arguments to 0, 10.
Expected behavior
It should print the numbers 0 through 999, inclusive, each on its own line, and then terminate. For comparison, this paramiko program does just that:
Actual behaviour
It typically prints the numbers 0 through 21 twice and then hangs. The exact numbers printed vary from run to run.
Additional information
ssh2_python-0.27.0