Question about running multiple commands against network appliances.

soap1337 commented 4 years ago

Greetings!

I just have a fast question here, i am STRUGGGLING to get async ssh to run more than one command per session/connection/channel/anything i always get hung up on executing multiple commands. my general goal here is to:

1.) connect to remote switch ( in this case cisco) [simple, i can do this today]
2.)execute a list of commands [ in my for loop it will only do the first in the list and then return the await statement
3.) write them to a file [easy]
4.) close the session [easy]

i am having the hardest time iterating through the list of commands, i can do 1 command for many hosts but not all the commands i need.

im working off the "multi_client" example in the docs and simply modifying it for how id like to use it, has anyone ever tried to do this?

thanks in advance!

ronf commented 4 years ago

Some folks have reported that Cisco switches sometimes have trouble with opening multiple SSH session over a single connection. If you are starting from the "multiple client" example, you may be running into that problem, but I would expect that to show up as an error when you attempted to open the additional sessions. Do you have any code you could post here which demonstrates the problem?

If you can get the multiple sessions thing to work, that gives you a much cleaner way to run multiple commands and capture the output from each command independently, without having to add any markers to the stream. However, there have been some previous discussions here about how to do this over a single session. If you need to go that route, you may want to take a look at the discussion in #227.

soap1337 commented 4 years ago

thanks for the fast response! so this is what i can get to work and it works flawlessly. I guess too i dont really know how to implement the "multiple sessions" yet. the 'cmds' list is the list of commands i am trying to run, currently it just takes a string as a cmd. and i have tried using the list in place of that, but no luck. when i try to do it a different way. i hit the elif 'result.exit_status !=0:'. i can get it to run all the commands but the script doesnt exit cleanly.

` import asyncio, asyncssh, sys, getpass from datetime import datetime

class async_test():

async def SSHcLient(host, command):
    async with asyncssh.connect(host, username=username, password=passwd, known_hosts=None) as conn:
            return await conn.run(command,  check=True) 

async def multi_client(list_input):
    hosts =  list_input

    tasks = (async_test.SSHcLient(host, 'show ver') for host in hosts)
    results = await asyncio.gather(*tasks, return_exceptions=True)

    for dev, result in enumerate(results, 1):
        file = open(hosts[dev-1] + '.txt', 'w+')
        if isinstance(result, Exception):
            print('Task %d failed: %s' % (dev, str(result)))
        elif result.exit_status !=0:
            print('Task %d exited with status %s:' % (dev, result.exit_status))
            print(result.stderr, end='')
        else:
            print('Task %d succeeded for device %s:' % (dev, hosts[dev-1]))
            #print(result.stdout + '\n', end='')
            file.write(result.stdout + '\n')

        print(25*'#')

if name == "main":

devs = ['dev1','dev2']
cmds = ['show mod', 'show ver']

passwd = getpass.getpass("passwd: ")
startTime = datetime.now()
loop = asyncio.get_event_loop()
loop.run_until_complete(async_test.multi_client(devs))
print('\nElapsed: ', datetime.now() - startTime)

`

ronf commented 4 years ago

Here are a few things I see:

In the code shown here, you aren't actually using "cmds", and the multi_client function is only ever trying to run "show ver".
You're opening up the the output file with mode "w+", which will overwrite the previous contents of the file each time. So, if you did loop over multiple commands, you'd probably run all of them but only see the output of the last command in the file. You might want to try using mode "a" instead so each open appends to the end of what you previously wrote.
You're not closing the file right now. While the garbage collection may eventually do that, you probably want to do it explicitly, especially if you are trying to open the same file multiple times in quick succession, to make sure previous data is flushed before you open the file again. One way to do that is to use "with" on the call to open().
- Only successful output is going to the file. All other output is still going to the console. That's probably why you can see it running multiple commands when you get an error.

If you want to run multiple commands on each host as different sessions on the same connection, you'll want to only do the asyncssh.connect() call once per host, but call conn.run() once for each command on that connection object. You can do this a few different ways depending on if you want to run the commands serially or in parallel on each host. If you want to run them serially, you'll also need to think about whether you want to keep running commands if a previous command returns a non-zero exit status.

Doing both the connections to the hosts in parallel and the commands in parallel on each host is a bit more complicated, but it's doable. You'd probably end up with a second gather() call in the "SSHcLient" call which looped over the list of cmds, calling conn.run() for each of those after doing the await on the call to connect(). You could have that return a list of result objects, and concatenate all those lists together into "results" at the top level before you loop over that.

soap1337 commented 4 years ago

thanks for the response! and yes this code is very greasemonkey/script kiddied together for a demo i was working on. Ideally i could have 1 file that has all the data i need in it but i was talking to some peers of mine and they suggested leaving it as it is (cleaning it up of course) and simply having multiple files for each type of output and then do something like an asyncio.create_task() and write each command output to a specific file and have our tools look for those specific files created vs reading millions of lines of text files.

the commands i am running are producing around 10k lines of out put to files on average. per device and i have to do a lot of devices :)

thanks for the help! when i my final script worked out for what Im trying to accomplish ill reply here. ill also try to do it the way you suggested for learning purposes.

thanks again for all the help!

babuloseo commented 4 years ago

I am also looking into this @ronf @soap1337 hopefully I should have something as well, looking for to it soap1337

ronf commented 4 years ago

One thing to keep in mind about calling run() is that all of the output is buffered in memory. If you have a really large amount of output, you might want to think about a different approach that writes the data to disk as it is coming in, rather than waiting for the command to complete and then writing it. Thankfully, AsyncSSH makes that very easy, as you can pass in arguments to tell it to automatically redirect stdin/stdout/stderr to/from files and it takes care of all of the incremental data pumping for you. That just leaves the exit status, which you can either write yourself to a different file to be checked later, or do other processing against that as soon as the it is returned.

To be clear, I definitely wasn't proposing putting everything in a single file. While you could do that if you used run() to buffer the individual command output in memory first, you'd have to worry about things like putting markers in the file to know when one command output ended and another began, and to identify each of the sections as far as which host and command they corresponded to.

Here's an example of using the stdio redirection I mentioned, and how you can run multiple commands on multiple hosts in parallel, writing output files per command containing the exit status, output to stdout, and output to stderr, or in the case of a connect failure, just a single status file for the host.

import asyncio, asyncssh

async def run_command(host, cmd, conn):
    """Run a command on a host and capture the exit status and output"""

    file_prefix = f'{host}_{cmd.replace(" ", "_")}'

    try:
        result = await conn.run(cmd, stdin=None, stdout=file_prefix + '_stdout',
                                stderr=file_prefix + '_stderr')
    except Exception as exc:
        status = f'Exception: {exc}'
    else:
        status = f'Exit status: {result.exit_status}'

    with open(file_prefix + '_status', 'w') as f:
        f.write(status)

async def run_commands(host, cmds):
    """Run a set of commands on a host"""

    try:
        conn = await asyncssh.connect(host)
    except Exception as exc:
        with open(f'{host}_status', 'w') as f:
            f.write(f'Exception: {exc}')
        return []
    else:
        return [run_command(host, cmd, conn) for cmd in cmds]

async def parallel_run(hosts, cmds):
    """Run a set of commands on a set of hosts in parallel"""

    results = sum([await run_commands(host, cmds) for host in hosts], [])
    await asyncio.gather(*results)

hosts = ('localhost', '127.0.0.1', '::1')
commands = ('echo foo', 'ls foo', 'sleep 5')
asyncio.run(parallel_run(hosts, commands))

babuloseo commented 4 years ago

The code seems to break if you only have one command in commands.

ronf commented 4 years ago

If you only have one command, you either need to use square brackets around it, or make sure to put a comma after that single command (inside the parentheses) so that the parenthesized expression becomes a tuple, rather than just a plain string in parentheses. Without the comma, the string will be treated as the sequence, and it'll treat each character in the string as a separate command to run.

luckydonald commented 4 years ago

You could add

if isinstance(cmds, str):
  cmds = [cmds]

That way, if cmds is a single string, it will be put in a single element list.

ronf commented 4 years ago

Yes, that would work, though I'd argue the function name of "run_commands" and the argument name of "cmds" indicates that a sequence of commands is expected there, rather than a single command string.

luckydonald commented 4 years ago

Fair point.

soap1337 commented 4 years ago

Ok, sorry busy number of weeks, so in my testing of this and playing with the different methods, my end result script its not entirely much different than what @ronf originally purposed. Heres the run down of what I discovered. In my use cases, I was polling/grabbing data from Cisco Switches and Some arista switches. For the cisco switches I was running commands against a wide variety of both OSs and OS Versions (IOS 12.2(17r) all the way up to NXOS 6.2.16 as well as IOS-XE 15.x no IOS_XR(Srry!))

The difficulties i ran into were MAINLY on the non NXOS and IOS-XE appliances and it had to do mostly with how asyncSSH was handling channels. I would consistently get this error when running the async script against older appliances:

2020-12-27 16:45:10.508881: Switch-1 : <class 'asyncssh.misc.ChannelOpenError'>

in this case the Switch-1 was a 6509-E running 12.2(17r)SX7

This error would only happen on the second command, i would get the output of the first command, proceed to the second command and produce that error. With the testing i did(based on my level of knowledge) i couldn't determine how to resolve it and continue the script, and what i ended up doing was writing a second slower script to handle the older appliances, which was fine in my case since they will be phased out soon.

Ill post my script here in the next couple of days, thanks!

ronf commented 4 years ago

It sounds to me like those old switches only support a single SSH session being created on each connection, which doesn't surprise me if they had a very bare-bones SSH implementation inside them. If that's the case, you'd rather have to open a new connection to those switches for each command, or you'd have to run all the commands sequentially on a single session, parsing the output to figure out yourself when each commands was done generating output. The former is probably simpler, but potentially a bit slower.

soap1337 commented 4 years ago

agreed, I think the strategy for me personally is to simply develop around the newer environments, the legacy gear is going away anyway, but I suppose for this module you can say it has been thoroughly tested against NXOS 5.x and newer, also sorry @babuloseo ill post my code asap so you can see what I did.

babuloseo commented 4 years ago

Honestly, I should be fine. I am more interested in trying to get authorisation/password prompt handling working as that has been a nightmare with asyncssh so far, I thought this thread was initially about that. I am mainly using the above code or the examples for one host and multiple commands. Currently, trying to find out the best performance for multiple commands. Anyway, thanks for the offer though:)

luckydonald commented 4 years ago

Regarding timeout problem, is maybe the other code running on the event loop taking too long, so that the ssh server disconnects?

ddutt commented 3 years ago

Old thread, but new input :)

The problem isn't gone away because of newer devices. IOS-XR suffers from the exact same problem reported above of only one command per connect! And this is even on their latest version! Sigh. Any new workarounds?

Best wishes, Dinesh

ddutt commented 3 years ago

More data. This code works i.e. as long as I can get in all the commands in before I do an await! If I add an asyncio.sleep in the commands for loop, the second command fails.

import asyncio
import asyncssh
import sys

class MySSHClientSession(asyncssh.SSHClientSession):
    def __init__(self):
        self._chan = None
        self._data = ''

    def data_received(self, data, datatype):
        data = data.strip()
        if data:
            self._data += data

    def connection_made(self, chan):
        self._chan = chan

    def eof_received(self):
        print(self._chan.get_command())
        print(self._data)
        print('\n')
        self._data = ''

async def run_client():
    options = asyncssh.SSHClientConnectionOptions(
        login_timeout=60,
        password="vagrant",
        username="vagrant",
        known_hosts=None)

    commands = ['show version', 'show run hostname']

    conn = await asyncssh.connect('192.168.121.248', options=options)

    for command in commands:
        chan, session = await conn.create_session(MySSHClientSession, command)

    await chan.wait_closed()

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))

ddutt commented 3 years ago

For those who're interested, on IOS-XR, I got things going with the code below. @ronf, any way I can start a shell session and just keeping sending instead of doing "tail -f" ?

import asyncio
import asyncssh
import sys

async def run_client():
    options = asyncssh.SSHClientConnectionOptions(
        login_timeout=60,
        password="vagrant",
        username="vagrant",
        known_hosts=None)

    commands = [
        'show version', 'show run hostname']

    conn = await asyncssh.connect('192.168.121.248', options=options)
    _  = await conn.open_session('run tail -f /var/log/syslog')

    for command in commands:
        data = await conn.run(command)
        print(data.stdout)

    conn.close()
    await conn.wait_closed()

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))

ronf commented 3 years ago

One thing I notice in the first version of code that you posted (using create_session() inside the for loop) is that you are creating multiple instances of "chan" and "session" but only keeping the latest ones around. This will lead to unpredictable behavior depending on when the garbage collector runs, potentially closing the channel before it has had a chance to run the command you provided. If you move the "await chan.wait_closed()" inside the for loop, it would fix that problem, but you're also not actually trying to read any of the command output in that version.

The second version which uses run() doesn't have that problem, as it implicitly waits for the channel to be closed before it returns. So, if that is working for you, I'd go with that. The only issue is that if the command response is large, it will need to all be buffered in memory before run() returns. If you just want to send the response to stdout, you could add something like "stdout=sys.stdout" as an argument to run() instead of having it collect the output and then printing it yourself when it's done.

One thing I'm still not really sure about is what you're doing with the "run tail -f /var/log/syslog" command. You're opening that session but never reading any of the output from it, or cleaning it up. It seems like you should be able to take that out, or make that another run() call.

ddutt commented 3 years ago

Hi @ronf , thanks for those helpful suggestions. I know in the first snippet I was calling everything fast and not really waiting. That was not real code I was using, but only showing it for illustrating what worked. In case of the second, I think I'll create a MyClientSSHSession like in your documentation to drain the output.

That run call is what makes the SSH connection persistent. If I close all presently open sessions, the Cisco appliance closes the SSH connection, and I want is a persistent SSH connection. I don't know how to create a session that for example just keeps sending \r\n to the input, thereby keeping the session alive. Am I clear in what I'm asking?

Dinesh

ddutt commented 3 years ago

I can confirm that this works also instead of the open_session() in the snippet above:

_ = await conn.create_process('run tail -f /var/log/syslog', stdout=DEVNULL, stderr=DEVNULL)

Would you agree that this is better than the previous code and also drains the messages?

Dinesh

ronf commented 3 years ago

Yes - calling createprocess() with redirection like that will cause it to discard all the output, but it's a bit dangerous to assign the result of that call to "". If you don't keep an active reference to the process, it may get garbage collected. Also, since we're talking about a "tail" command here, what's the point of running the command at all if you're just going to discard the output?

ddutt commented 3 years ago

Oh, OK, I can save it and not have it garbage collected.

Like I said, I need a command that's long running (potentially forever) because that ensures the Cisco router doesn't terminate the SSH connection after it services a command.

So, what I want is to a persistent connection to avoid connection setup/teardown. I'm running multiple commands periodically to gather data from the router. For every other router I've worked with so far, I do a connect() once and then continue to do conn.run() whenever I want to execute a command. This works without requiring me to do the connect each time. But with this Cisco router, it doesn't work. The moment conn.run() returns, the connection is closed. But if I first do the create_process() of the never ending tail, conn.run() doesn't teardown the connection on finishing the run.

Does my reason for doing this make sense? It's a workaround for the bug reported in this ticket, cannot run multiple commands.

Dinesh

ronf commented 3 years ago

Ah, I see. Does the router has any kind of setting for inactivity timeout that you could possibly change to avoid it closing the connection on you?

In a quick search, I found https://community.cisco.com/t5/wireless/to-increase-ssh-session-timeout/td-p/3098020, but I'm not sure if that'll be applicable to the specific kind of router you are using. Trying to find some kind of "exec" or "session" timeout seems like a good thing to look for, though.

If you're not able to find such a setting, I guess the "tail" command isn't a bad option as long as the syslog output isn't all that heavy.

ronf commented 3 years ago

Here's something IOS-XR specific: https://tools.cisco.com/security/center/resources/increase_security_ios_xr_devices.html

Search for "Set Exec Timeout". It looks like it defaults to 10 minutes.

ddutt commented 3 years ago

Thanks for all the searching @ronf. It doesn't even take a sec, let alone 10 mins for the connection to end. The moment the first command is finished, the connection ends, unless I have the tail equivalent running. I guess I can find a really dead or slow filling log instead of syslog.

Dinesh

ronf commented 3 years ago

What's curious here is that you seem to have no problem opening multiple sessions on a single SSH connection, each running their own independent command. In the past, one of the problems with embedded SSH servers is that they don't always support creating more than one session on a connection, but that doesn't seem to be a problem here.

Perhaps the issue is that the SSH server on the router is closing the connection as soon as the last session closes. That would explain why you need to start a long-running session first, before running the other short-lived sessions.

ddutt commented 3 years ago

Makes sense

ddutt commented 3 years ago

Since it just needs to be a long running process, doing a "tail -s 3600 -f /etc/os-version" or some fixed file is even better don't you think?

ronf commented 3 years ago

Closing due to inactivity. Please open a new issue if you have additional questions.

jimguthrie commented 3 years ago

Hi Ron,

I wanted to add a comment that explains this behavior explicitly: When the session ends it sends an 'exit' code for the session down the line - it actually shows up as an 'exit' command on the Cisco command buffer, which is how you close the entire connection within the device CLI. I can reproduce it on a few different versions of cisco gear with the "show history all" So I believe letting the long running processes go is just prolonging the exit status send function, and slipping other stuff in before it gets executed.

If I elevate within the program (move to a nested menu), it keeps returning 'exit' commands until it drops out of elevation - which if I read the code right, makes perfect sense as a session cleanup loop.

In summary - it makes perfect sense from both angles: When interacting with a network device, 'exit' is how a human will cleanly kill the entire connection(and session). This just also happens to overlap with the way sessions are handled for multi-session applications. So I don't know that there is any way to really avoid that without manually controlling the exit status.

ronf commented 3 years ago

When you say "it keeps returning 'exit' commands", what is the "it" here? Are you talking about an SSH client sending "exit" commands repeatedly until the SSH server closes the session? Also, how does the exit status fit into this? The exit status would be set by the SSH server and sent from the server to the client, whereas "exit" commands would go the other way, from the client to the server, and there shouldn't be an exit status returned from the server to the client until it is time for the server to close the session (possibly triggered by receiving an "exit" at the top level).

For servers that support multiple sessions on a single connection, all of the above would be completely independent per-session. Sending "exit" commands on a session until it closes should work, but it should only affect the specific session you are sending "exit" to, and not any other sessions you may have open. Similarly, if a session does close and return an exit status, that shouldn't have any impact on any other open sessions.

jimguthrie commented 3 years ago

Here, let me show you exactly what I ran - as I might be explaining things poorly:

Script I'm running:

import asyncio, asyncssh, sys

class MySSHClientSession(asyncssh.SSHClientSession):
    def data_received(self, data, datatype):
        print(data, end='\n')

    def connection_lost(self, exc):
        if exc:
            print('SSH session error: ' + str(exc), file=sys.stderr)

old_algs ='aes256-cbc,aes192-cbc,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,seed-cbc@ssh.com,arcfour256,arcfour128,arcfour'
async def run_client():
    async with asyncssh.connect('192.168.49.10', username='xxxx', password='xxxxx', encryption_algs=old_algs) as conn:
        chan, session = await conn.create_session(MySSHClientSession, 'show ip arp')
        await chan.wait_closed()

try:
    asyncio.get_event_loop().run_until_complete(run_client())
except (OSError, asyncssh.Error) as exc:
    sys.exit('SSH connection failed: ' + str(exc))

This is the command history on the Router, as interpreted in session:

*Oct 13 18:05:07.843: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: xxxxx] [Source: 192.168.49.100] [localport: 22] at 18:05:07 UTC Wed Oct 13 2021
CMD: 'show ip arp' 18:05:07 UTC Wed Oct 13 2021
CMD: 'exit' 18:05:07 UTC Wed Oct 13 2021

You can see where the script logs in, executes the command I had in the string, and then sends an 'exit' command directly into the cisco command interpreter, and obviously isn't anywhere in my code.

Another example of output, but this time I used a command that 'elevates' you within the router ('conf t'). So you have to use a command "exit" (or end) to drop down, and then if you type exit again it closes the session out entirely. I did this test to try and discern if the CMD buffer history was misrepresenting SSH control overhead:

*Oct 13 18:09:34.135: %SEC_LOGIN-5-LOGIN_SUCCESS: Login Success [user: xxxx] [Source: 192.168.49.100] [localport: 22] at 18:09:34 UTC Wed Oct 13 2021
CMD: 'conf t' 18:09:34 UTC Wed Oct 13 2021
CMD: 'exit' 18:09:34 UTC Wed Oct 13 2021
*Oct 13 18:09:34.199: %SYS-5-CONFIG_I: Configured from console by xxxxx on vty0 (192.168.49.100)
CMD: 'exit' 18:09:34 UTC Wed Oct 13 2021

you can see here it sends another exit statement on top of the first one. I could be entirely off base, but I think it's an exit code being sent from the client to the server and being mis-interpreted as a command on the channel. Then as it sees the session hasn't closed (because it dropped down instead of closed) it sends it again to clean it up (and kill the session)

ddutt commented 3 years ago

@clandestinefool : I'm not sure I totally follow you, but I think you're interpreting the two exits incorrectly. Network OSes are a modal CLI model, where every command except a show (and a few others) drop you into a level. To exit from the SSH session, you have to exit from each deeper level all the way to the top. Some NOSes offer an "end" option to exit to the top right away.

And to your original point. The behavior being described as an issue only affects Cisco IOS. It doesn't affect Cisco's NXOS, Arista, Cumulus, Juniper and so on.

jimguthrie commented 3 years ago

@ddutt That's not precisely true, as you can see I get input errors and ambiguity flags trying different commands. Though to your point it may just purely be a quirk of old world IOS.

R1(config)#this
            ^
% Invalid input detected at '^' marker.

R1(config)#does
             ^
% Invalid input detected at '^' marker.

R1(config)#not drop
             ^
% Invalid input detected at '^' marker.

R1(config)#ex
R1(config)#ex
% Ambiguous command:  "ex"
R1(config)#

ddutt commented 3 years ago

Sorry, I miscommunicated. The commands have to succeed, and not every command drops you down a level, but many do, for example:

conf t
int Eth1/1
no shut
exit
exit
exit

or

conf t
router bgp 64502
address-family ipv4 unicast
exit
exit
exit

ronf commented 3 years ago

@clandestinefool Thanks for clarifying.

My guess as to what's going on here is that the "exit" messages you are seeing in the log are being inserted by the Cisco device. As @ddutt mentioned, you'll see multiple "exit" commands in the log output if CLI is in a nested "sub-mode" such as config mode, or even something deeper where you are configuring a particular subsystem. It looks like for logging purposes it inserts enough "exit" commands to exit all the way to the top level in order to show the session was ended.

I think you'd probably see "exit output like this in a couple of different cases:

Passing in a command to run on the conn.create_session() call (or other variants like run() or create_process()). In this case, the SSH protocol expects the server to end the session as soon as that command completes, and an interactive "shell" is not opened, though you might still be able to provide input on that session if the command you run needs it before it exits.
Calling write_eof() explicitly on an interactive "shell", where you don't pass in what command to run when creating the session. In this case, since no more input can come from the client, the shell is going to keep backing out of the sub-modes it is in until it gets back to the top level, at which point it would exit and your client call to wait_closed() would return.

None of this should have any impact on any other sessions which were opened in parallel, but perhaps some Cisco devices don't handle that correctly, maybe exiting out of all sessions when one of them closes (or at least when the first one closes). That could explain why running a long-running command on the first session allows other sessions to be opened (and closed) successfully, but otherwise things close prematurely.

network-shark commented 3 years ago

There is a library https://pypi.org/project/scrapli/ which uses asyncshh under the hood , specifically build for dealing with network devices.

jimguthrie commented 3 years ago

@ronf Thanks for taking the time to respond. As an aside, your project has made me appreciate how many little things I've taken for granted when it comes to SSH.

@network-shark Yeah! Carl's project is great, I've just been digging into some lower level stuff.

ronf commented 3 years ago

Thanks for the kind words, @clandestinefool - I'm glad to hear that you're finding AsyncSSH useful. It has been fun to work on.

ronf / asyncssh

Question about running multiple commands against network appliances. #241