carlmontanari / scrapli

Fast, flexible, sync/async, Python 3.7+ screen scraping client specifically for network devices
https://carlmontanari.github.io/scrapli/
MIT License

Transport Inconsistency #260

Closed eseglem closed 2 years ago

eseglem commented 2 years ago

Describe the bug When using different transport layers (telnet vs ssh), the read / write commands don't quite behave the same way, meaning the Driver has to know which Transport is being used to function consistently across all transports. Based on my interpretation of the docs, it would seem the transports should be transparent to the Driver.

It is definitely possible that the device I am connecting to is behaving poorly, but that is harder to tell. I can sit in Wireshark and read all of the telnet traffic, but not so much when it's ssh.

I am working on a Python API wrapper for more recent WattBoxes, which no longer have the nice REST API and only have a telnet / ssh API. Its behavior is identical between the two protocols; only how you connect is different.

Once you are logged in you are met with Successfully Logged In!\n, so you are in a terminal on a new line with nothing else on it. You can type in various input commands starting with either ? (to get info) or ! (to set things), and it will respond with your ? and additional info, or with Ok\n or #Error\n. Docs Here if they matter.

To Reproduce

from scrapli import Driver

MY_DEVICE = {
    "host": "...",
    "auth_username": "...",
    "auth_password": "...",
    "auth_strict_key": False,
}

tconn = Driver(
    **MY_DEVICE,
    port=23,
    transport="telnet",
    # This was the best way I found to get it through all the login prompts
    comms_prompt_pattern=r"^Successfully Logged In!\n$"
)
tconn.open()
# It doesn't really have "prompts", just sends a line, this seems to work 
tconn._base_channel_args.comms_prompt_pattern = r"^.*$"

# Send a command
tconn.channel.write("?Firmware\n")
print(tconn.channel.read())
# b'?Firmware=2.2.1.0\n'

# Exactly the same settings and commands other than port / transport.
sconn = Driver(
    **MY_DEVICE,
    port=22,
    transport="ssh2",
    comms_prompt_pattern=r"^Successfully Logged In!\n$"
)
sconn.open()
sconn._base_channel_args.comms_prompt_pattern = r"^.*$"

# Two extra `.read` here necessary to get through the logged in message
print(sconn.channel.read())
# b'Connecting...\n'
print(sconn.channel.read())
# b'Successfully Logged In!\n'

# Send the same command
sconn.channel.write("?Firmware\n")
print(sconn.channel.read())
# b'?Firmware\n?Firmware=2.2.1.0\n'
# Where the `.read` includes what I sent with `.write` and the response from telnet read.

Expected behavior Consistent behavior between transport layers so I can write one Driver which can use telnet, asynctelnet, ssh2, or asyncssh to control the system depending on the user's needs.

Additional context I also cannot use the channel.send_input command, because it will .write and then _read_until_input, which reads the actual response but tosses it, and then it sends an extra \n, which gets an #Error response. And if I don't include the \n in my message it won't ever get anything for _read_until_input since the server is still waiting on the \n.

If this is expected behavior, any tips for working around this without having to write a bunch of extra logic around the transport in a Driver?

carlmontanari commented 2 years ago

Not having a prompt is going to make this miserable for you in general i think!

probably will need logs, but basically you can never assume that a read will return the same thing. Read from a channel reads whatever bytes are available. The fact that you read what should have already been read after the open call (because of your prompt pattern), is not ideal though and probably shouldn’t be happening. Logs should help us see what’s up if anything there though. Telnet auth happens “in channel” though where ssh2/paramiko/asyncssh does not, so there may be some things getting consumed from the channel in telnet case but not the other. Again, without a prompt to read up until scrapli can’t do much, as you’ve seen by needing to drop onto the channel!
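A minimal sketch of what "a read returns whatever bytes are available" means in practice for a line-oriented device like this one: buffer the raw chunks and only hand back complete lines. LineReader and read_chunk are hypothetical names for illustration, not part of scrapli; read_chunk stands in for something like conn.channel.read, and the sample chunks are made up from the responses quoted above.

```python
from typing import Callable


class LineReader:
    """Accumulate raw chunks and hand back complete newline-terminated lines."""

    def __init__(self, read_chunk: Callable[[], bytes]) -> None:
        self._read_chunk = read_chunk  # hypothetical stand-in for channel.read
        self._buf = b""

    def read_line(self, max_reads: int = 50) -> bytes:
        """Return the next full line, without its trailing newline."""
        reads = 0
        while b"\n" not in self._buf:
            if reads >= max_reads:
                raise TimeoutError("no complete line after max_reads reads")
            self._buf += self._read_chunk()
            reads += 1
        # Keep anything after the first newline for the next call.
        line, _, self._buf = self._buf.partition(b"\n")
        return line


# One response split across reads, and a second one starting mid-chunk.
chunks = iter([b"?Firm", b"ware=2.2.1.0\nO", b"k\n"])
reader = LineReader(lambda: next(chunks))
first = reader.read_line()
second = reader.read_line()
```

Because the leftover bytes stay in the buffer, the caller does not need to care how the transport happened to split the data.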

wrt send command, you should not be putting a new line char in there, scrapli does that for you. That said, again without the prompt you probably can’t use driver methods because scrapli needs a prompt to read to.

And if I don't include the \n in my message it won't ever get anything for _read_until_input since the server is still waiting on the \n.

This should not be the case. Unless your device literally does not show the characters as you type them. Otherwise scrapli ships your input then reads the chars it sent off the channel.

long story long, it doesn’t sound (so far, happy to be proven wrong though!) like there is any issue with scrapli's expected behavior here. To be totally honest I don’t know if scrapli is the right tool here. You absolutely can use it, and it can probably save you some hassle with spinning up connections and doing some basic stuff, but you will probably have to do all the work yourself by reaching into the channel as you’ve shown here, and handle all the reading/writing logic that way. Access to that is of course an intended part of scrapli, but usually you only want to do that as a last resort.

So yeah, tldr — logs would be great and hopefully my Sunday evening rambling makes sense and gives us some next steps :grin:

eseglem commented 2 years ago

Yeah, not having a real prompt definitely isn't ideal. I was trying to treat the \n as the prompt but that didn't seem to work. I may have been doing something else wrong, but _process_read_buf seems to partition on it and prevent matching on it. So I am stuck with the bare read for the most part. It has taken a lot of experimenting to figure out how everything is behaving.

Despite how it may look, Scrapli actually feels like the best tool for the job, even if I have to deal with the channels for the most part. It's not that bad. Finding Scrapli was a great leap forward in progress for this. I had to start with sync telnet, watching with Wireshark until I got it right. And then ssh2, asyncssh, and asynctelnet all fell right into place, at least as far as connecting to them from Python goes. Whereas I originally started trying to work with asyncssh directly, and I never managed to get it connected to the server. Connecting works great inside Scrapli though. And as far as I have found, it's the only library that actually does asynctelnet. So even if I have to go deeper into Scrapli than most, it at least manages Telnet and SSH connections in both sync and async fashion for me, all in one interface. And I can concentrate on sending commands and parsing responses.

Prompt After Login

I think because telnet does the auth in the channel the read at the end of open pulls the "Successfully Logged In!" out, but it doesn't appear to even try with ssh2. Or it may just not be in the channel yet when it does?

If I need to I can probably do something in a Driver based on the transport method it picks if it needs the extra read or not. Just didn't expect the slight difference between the two.

Read Until Input

I do think there is a difference in how Telnet transport functions compared to the SSH. Telnet Write is just raw socket sending, so the only thing that ends up in the channel buffer is the response from the server. That makes send_input not work: if I don't include the \n, _read_until_input reads nothing, since the server doesn't respond at all without it. Or if I include \n it reads the actual response from the server, then sends its own \n, and the server returns #Error\n because it doesn't understand the extra \n. send_input then returns that #Error\n thinking it's the response, when it has already eaten the real response.

Whereas SSH2 Write is writing to the session channel, which appears to stay there and needs to be read off. The send_input function works better here. But perhaps I misunderstood what it's supposed to do. It doesn't appear to read the response for me, just what is sent to the server. And I still need to do an additional read.
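To make the difference concrete, here is a toy model of the two behaviors. This is not scrapli code: FakeChannel and send_input_style are hypothetical names, the device responses are the ones quoted earlier in this thread, and a real channel read would block or time out where this fake one returns empty bytes.

```python
from typing import Tuple


class FakeChannel:
    """Toy line-oriented device that only answers once it receives a newline."""

    def __init__(self, echo: bool) -> None:
        self.echo = echo
        self.pending = b""   # bytes sent but not yet terminated by a newline
        self.readable = b""  # bytes waiting to be read by the client

    def write(self, data: bytes) -> None:
        if self.echo:
            self.readable += data  # ssh-like: our own input shows up on the channel
        self.pending += data
        while b"\n" in self.pending:
            line, _, self.pending = self.pending.partition(b"\n")
            self.readable += self._respond(line)

    @staticmethod
    def _respond(line: bytes) -> bytes:
        if line == b"?Firmware":
            return b"?Firmware=2.2.1.0\n"
        return b"#Error\n"  # includes the empty line from a bare newline

    def read(self) -> bytes:
        data, self.readable = self.readable, b""
        return data


def send_input_style(channel: FakeChannel, command: bytes) -> Tuple[bytes, bytes]:
    """Mimic the order: write, read back input, send return, read response."""
    channel.write(command)
    echoed = channel.read()    # stand-in for _read_until_input
    channel.write(b"\n")       # stand-in for send_return
    response = channel.read()  # stand-in for the final read
    return echoed, response


# Telnet-like (no echo), newline included: the real answer is consumed as "echo".
no_echo = send_input_style(FakeChannel(echo=False), b"?Firmware\n")
# SSH-like (echo), no newline: the input is read back, then the answer arrives.
with_echo = send_input_style(FakeChannel(echo=True), b"?Firmware")
```

In the no-echo case the first read swallows the firmware answer and the extra return produces #Error, which matches what I see over telnet.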

Logs

carlmontanari commented 2 years ago

Thanks for grabbing the logs!

I’ll take a closer look this weekend, but in the meantime quick comments:

Will check out the logs and try to respond more this weekend!

carlmontanari commented 2 years ago

Got a bit more time so some more detailed responses:

I think because telnet does the auth in the channel the read at the end of open pulls the "Successfully Logged In!" out, but it doesn't appear to even try with ssh2. Or it may just not be in the channel yet when it does?

You are correct -- because you are opening things with the comms_prompt_pattern set to "Successfully Logged In!" the telnet (and asynctelnet and system) transport will open the connection, read the username/password prompts and deal with that, then read until it finds the "initial" prompt. SSH2/paramiko/asyncssh handle all this auth business I guess at "the protocol" level, so they are not reading the bytes as you and I (and telnet/asynctelnet/system transports) do when looking at a terminal.

If I need to I can probably do something in a Driver based on the transport method it picks if it needs the extra read or not. Just didn't expect the slight difference between the two.

Normally this would be handled by an on_open function. This thing is just a callable that you can give to scrapli to do "setup" type things on a newly opened connection. Normally this would be things like disable pagination and the like. In your case you could just have an on_open maybe something like this (obviously pseudocode!):

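from scrapli import Driver

def on_open(driver: Driver) -> None:
    # transport_name holds the transport string the Driver was created with
    if driver.transport_name not in ("telnet", "asynctelnet", "system"):
        # consume that initial login prompt stuff off the channel
        driver.channel.read()

    driver.channel.write("some prep command or not, up to you!")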

I do think there is a difference in how Telnet transport functions compared to the SSH

In scrapli-netconf we deal with this (I've not seen it in "normal" telnet/ssh but perhaps a similar thing is happening here?) -- I've unimaginatively called this "server echo" -- some devices "echo" the things we write to them back onto the channel, others just accept it and move on. For those that "echo", we will obviously read that back when we consume from the channel. This is also why the eager flag exists on the send_commandX methods -- this flag skips reading back our input off the channel and simply "eagerly" sends the return after writing the input bytes.

Or if I include \n

Generally I would recommend against including this, but of course if you write directly to the channel (channel.write) then you obviously need it!

So as to the original point of the issue -- inconsistencies between telnet/ssh...

I would be curious if system transport behaves the same as telnet or as ssh2/paramiko/asyncssh. I'd also want to know if ssh2/paramiko/asyncssh all behave the same. And if telnet/asynctelnet behave the same. I guess the device could behave slightly differently with telnet vs ssh (echo vs not perhaps) -- if all the ssh transports behave the same and the telnet ones behave the same I'd say "no issue here" (wrt scrapli I mean) and you could then wrap scrapli things with your custom logic to handle the difference.
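As a rough sketch of what that custom wrapping could look like, a small helper can drop the echoed input when it is present, so the caller sees the same bytes either way. strip_echo is a hypothetical name, not a scrapli function, and it assumes the echo, if any, arrives as the first line of what was read.

```python
def strip_echo(sent: bytes, received: bytes) -> bytes:
    """Drop a leading copy of what we sent, if the server echoed it back."""
    sent_line = sent.rstrip(b"\r\n")
    lines = received.splitlines(keepends=True)
    # Only the first line is compared, so later identical lines are left alone.
    if lines and lines[0].rstrip(b"\r\n") == sent_line:
        lines = lines[1:]
    return b"".join(lines)


# ssh-style read (with echo) and telnet-style read (no echo) from this thread
from_ssh = strip_echo(b"?Firmware\n", b"?Firmware\n?Firmware=2.2.1.0\n")
from_telnet = strip_echo(b"?Firmware\n", b"?Firmware=2.2.1.0\n")
```

One caveat: if a device ever answers with a line identical to the command itself, this would strip a real response, so treat it as a starting point only.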

Regardless, I am probably not super inclined to make any changes here since this is the first time this particular quirk has come up.... but im happy to try to help talk through things and get ya running full steam ahead, and, all that said, im always open to being convinced to change my mind about changing things or whatever as well 😁

HTH!

Carl

eseglem commented 2 years ago

I just got thrown off because effectively the telnet ones do a _read_until_prompt for you during the open, where the ssh ones do not, and you have to do it yourself. I initially expected all of them to get to the same spot when calling open, but not a big deal.
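A rough sketch of how that extra read could be made transport-aware so every transport ends up in the same spot after open. The names drain_login_banner and fake_driver are hypothetical, and I am assuming the driver exposes the transport string as transport_name and that channel.read returns bytes; the fake driver below exists only so the sketch runs without a device.

```python
from types import SimpleNamespace

# Transports that handle auth "in channel" and already read up to the banner.
IN_CHANNEL_AUTH = ("telnet", "asynctelnet", "system")
BANNER = b"Successfully Logged In!"


def drain_login_banner(driver, max_reads: int = 10) -> bytes:
    """Read past the login banner on transports that did not consume it in open."""
    if driver.transport_name in IN_CHANNEL_AUTH:
        return b""  # nothing left to drain
    seen = b""
    for _ in range(max_reads):
        seen += driver.channel.read()
        if BANNER in seen:
            return seen
    raise TimeoutError("login banner not seen")


def fake_driver(transport_name: str, chunks: list) -> SimpleNamespace:
    """Stand-in for a connected driver, for illustration only."""
    it = iter(chunks)
    channel = SimpleNamespace(read=lambda: next(it))
    return SimpleNamespace(transport_name=transport_name, channel=channel)


ssh_seen = drain_login_banner(
    fake_driver("ssh2", [b"Connecting...\n", b"Successfully Logged In!\n"])
)
telnet_seen = drain_login_banner(fake_driver("telnet", []))
```

A function like this could presumably be passed as the on_open callable, since it only needs the driver as its argument.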

I was thinking about doing that same thing by overriding open, but an on_open does seem to be nicer.

Eager sadly doesn't work for what I am seeing.

        with self._channel_lock():
            self.write(channel_input=channel_input)
            # timeout happens here \/ since there is nothing to read and nothing coming until a `\n` is sent
            _buf_until_input = self._read_until_input(channel_input=bytes_channel_input)  
            self.send_return()

            # eager doesn't do anything until down here, after an exception is already raised
            if not eager: 
                buf += self._read_until_prompt()

Telnet / AsyncTelnet have behaved the same so far. And SSH2 / AsyncSSH have behaved the same so far. If I remember correctly I didn't manage to get System to connect at all, and haven't checked Paramiko. I'll try to check both of those next time I take a crack at it.

I am way out of my element here. Quite different than the REST APIs I am used to. So, really not sure what is sensible and what isn't. Asking Telnet and SSH to behave the same probably is a bit of a stretch. And since this device seems to be quite unique, I can't imagine it makes any sense to add potentially breaking changes to the library when I can just work around them.

carlmontanari commented 2 years ago

Eager sadly doesn't work for what I am seeing

thats right, you already mentioned that, sorry!

telnet things behaving the same and ssh things behaving the same makes me feel like things are reasonably well behaved so thats good to hear!

I guess at this point I'll go ahead and close this out -- feel free to reach out or move this to a discussion (so I can be at zero issues to appease my ocd!) if ya wanna chat more. Hoping you can get things working nicely!

Carl