fgimian / paramiko-expect

A Python expect-like extension for the Paramiko SSH library which also supports tailing logs.
MIT License
205 stars, 78 forks

paramiko-expect: expect loop take long time with a long output #48

Open migubarh opened 5 years ago

migubarh commented 5 years ago

Hi,

I am starting to use Python to migrate old Tcl scripts. paramiko-expect has been working perfectly, thanks for the tool. Recently, though, we hit an issue when capturing very long CLI output from our servers: the expect command never returns and Python consumes a lot of CPU for a long time.

We inspected the code and suspect that the prompt check against "self.current_output" becomes very slow as the output grows.

Do you see any way to optimize this part?

Thank you!

fruch commented 5 years ago

We could maybe turn it into a cyclic buffer, and you would need to define its size.

sar772004 commented 5 years ago

I'm also seeing this issue in the block below, with output of a few thousand lines. Can you elaborate on the proposed fix?

    not [re_string for re_string in re_strings if re.match(default_match_prefix + re_string + '$', self.current_output, re.DOTALL)]

fruch commented 5 years ago

The issue is that we currently cache all output and run the regex over all of it. At a certain size this becomes a problem: e.g. a session running for a day might become very slow, and you can run out of memory.

My suggestion is to use a cyclic buffer, so that the amount of data we keep in memory is capped at a maximum that can be set in the constructor.
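A minimal sketch of the bounded-buffer idea, not the library's implementation: `TailBuffer`, `max_chars`, and `ends_with` are hypothetical names invented for illustration, and the sketch assumes the prompt always fits inside the retained tail.

```python
import re


class TailBuffer:
    """Keep only the last `max_chars` characters of a text stream (hypothetical helper)."""

    def __init__(self, max_chars=4096):
        if max_chars <= 0:
            raise ValueError('max_chars must be positive')
        self.max_chars = max_chars
        self.data = ''

    def append(self, chunk):
        # Concatenate, then drop everything older than the last max_chars characters.
        self.data = (self.data + chunk)[-self.max_chars:]

    def ends_with(self, pattern):
        # The regex only ever sees at most max_chars characters,
        # so the cost of each check stays bounded however long the session runs.
        return re.search('(?:' + pattern + r')\Z', self.data) is not None


buf = TailBuffer(max_chars=16)
for chunk in ['line 1\n', 'line 2\n', 'user@host:~$ ']:
    buf.append(chunk)
print(buf.ends_with(r'\$ '))
```

The trade-off is that older output is discarded: a caller that needs the full command output would have to store it separately, and a prompt longer than `max_chars` can never match.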

gstodorov commented 4 years ago

Hi, I know this is an old thread, but I have changed the code a bit to compare against the last line of current_output + current_buffer_decoded instead of the whole current_output. It works for me and increased the speed significantly (from 30+ minutes down to less than a minute).

        # Create an empty output buffer
        self.current_output = ''
        current_buffer_decoded_compare = ''

        while (
            len(re_strings) == 0 or
            not [re_string
                 for re_string in re_strings
                 if re.match(default_match_prefix + re_string + '$',
                             current_buffer_decoded_compare, re.DOTALL)]
        ):

            # Add the currently read buffer to the output
            self.current_output += current_buffer_decoded
            current_buffer_decoded_compare = self.current_output.splitlines()[-1] + current_buffer_decoded

When instantiating the class I'm also increasing the buffer size:

    interact = SSHClientInteraction(client, timeout=10, display=False, buffer_size=65535)
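The last-line approach can be sketched on its own, outside the library. This is an illustration under stated assumptions, not paramiko-expect's code: `read_until_prompt` is a hypothetical function, `chunks` stands in for repeated decoded channel reads, and the prompt is assumed never to contain a newline.

```python
import re


def read_until_prompt(chunks, prompt_pattern):
    """Accumulate decoded chunks until the last line ends with prompt_pattern.

    `chunks` is any iterable of already-decoded strings; it stands in for
    repeated channel reads and is an assumption of this sketch.
    Returns (full_output, matched).
    """
    prompt = re.compile('(?:' + prompt_pattern + r')\Z')
    parts = []       # full output, joined once on return
    last_line = ''   # text after the most recent newline seen so far
    for chunk in chunks:
        parts.append(chunk)
        # Only the text after the final newline can hold the prompt, and
        # carrying it over handles a prompt that is split across two chunks.
        combined = last_line + chunk
        last_line = combined[combined.rfind('\n') + 1:]
        if prompt.search(last_line):
            return ''.join(parts), True
    return ''.join(parts), False


output, matched = read_until_prompt(['a\nb\nuser@', 'host:~$ '], r'.*\$ ')
print(matched, len(output))
```

Because the regex runs only on the current last line, the cost of each check no longer grows with the total amount of output collected.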

sar772004 commented 3 years ago

Please check 0.3.0; it should have a fix for this. If it works for you, we can close this