knipknap / exscript

A Python module making Telnet and SSH easy
MIT License

Getting info about exceptions when using Queue #110

Closed: braincrash closed this issue 7 years ago

braincrash commented 9 years ago

I'm having trouble catching this exception. Can anyone help me out? I want to be able to use the message in the buffer, because the messages might differ from device to device.

Thanks

192.168.1.189 error: Buffer: 'Remote management disabled or remote management IP is limited !!!\r\nReject ...\r\n'
Traceback (most recent call last):
  File "/Exscript/src/Exscript/workqueue/Job.py", line 64, in run
    self.function(self)
  File "/Exscript/src/Exscript/Queue.py", line 91, in _wrapped
    result = func(job, host, conn, *args, **kwargs)
  File "/Exscript/src/Exscript/util/decorator.py", line 103, in decorated
    conn.login(flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 627, in login
    self.authenticate(account, flush = False)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 651, in authenticate
    self.app_authenticate(app_account, flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 819, in app_authenticate
    self._app_authenticate(account, password, flush, bailout)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 723, in _app_authenticate
    raise TimeoutException(msg)
TimeoutException: Buffer: 'Remote management disabled or remote management IP is limited !!!\r\nReject ...\r\n'

192.168.1.189 finally failed.

egroeper commented 9 years ago

I'm not sure I understand your problem correctly. TimeoutException: Buffer generally means that exscript didn't manage to catch a prompt in time. Judging by the buffer contents, the driver you are using (which one is it?) seems to be missing a pattern for this error message. Unfortunately, fixing this would only mean that exscript could detect the error; the error itself would still be there. exscript would then throw an InvalidCommandException.
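
To illustrate why a missing pattern ends in a timeout, here is a minimal, standard-library-only sketch. It is not Exscript's implementation: the function `classify` and the constants `ERROR_RE` and `PROMPT_RE` are hypothetical names, and the patterns are only modelled on the messages in this thread. The idea is that received text is checked against known error and prompt patterns, and if nothing matches, the client can only wait until its timeout expires.

```python
import re

# Hypothetical pattern lists, modelled on the device output in this thread.
ERROR_RE = [re.compile(r'remote management disabled', re.I)]
PROMPT_RE = [re.compile(r'(\w+) :\> ')]


def classify(buffer, error_re=ERROR_RE, prompt_re=PROMPT_RE):
    """Return 'error', 'prompt' or 'timeout' for the received text."""
    if any(regex.search(buffer) for regex in error_re):
        return 'error'
    if any(regex.search(buffer) for regex in prompt_re):
        return 'prompt'
    # Nothing matched: a real client would keep waiting until it times out.
    return 'timeout'


buffer = ('Remote management disabled or remote management IP is limited !!!'
          '\r\nReject ...\r\n')
print(classify(buffer))               # error pattern is known
print(classify(buffer, error_re=[]))  # error pattern is unknown
```

With the error pattern in place the message is recognised immediately; without it, the same text falls through to the timeout case, which matches the behaviour described above.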

The following doesn't work?

try:
    <your code>
except TimeoutException as te:
    print te

If this is about fixing or improving the driver or exscript itself, posting here is the right way to go. Otherwise you should use the mailing list for your question.

braincrash commented 9 years ago

My code is identical to yours; I tried this construct and it doesn't work.

I also tried a more generic one:

try:
    code
except:
    print "something"

It's not a problem with the prompt, because I never get a prompt. It's an error returned by the router when telnet is disabled. If you have any ideas on how to deal with it, I would appreciate it! Thanks

braincrash commented 9 years ago

I've made an update to the driver:

_user_re     = [re.compile(r'User Name : ', re.I)]
_password_re = [re.compile(r'User Password : ', re.I)]
_prompt_re   = [re.compile(r'(\w+) :\> ', re.M|re.S|re.I)]
_login_fail_re = [re.compile(r'[\r\n]invalid password', re.I),
                  re.compile(r'unable to verify password', re.I),
                  re.compile(r'emote management disabled or remote management IP is limited', re.I),
                  re.compile(r'unable to login', re.I)]
_error_re = [re.compile(r'%Error'),
             re.compile(r'invalid input', re.I),
             re.compile(r'(?:incomplete|ambiguous) command', re.I),
             re.compile(r'connection timed out', re.I),
             re.compile(r'[^\r\n]+ not found', re.I)]

class ADBPirelli1000Driver(Driver):
    def __init__(self):
        Driver.__init__(self, 'ADBPirelli1000')
        self.user_re     = _user_re
        self.password_re = _password_re
        self.prompt_re   = _prompt_re
        self.login_error_re = _login_fail_re
        self.error_re    = _error_re

    def check_head_for_os(self, string):
        if _user_re[0].search(string):
            return 70
        return 0

Now it returns

192.168.1.189 error: Login failed
Traceback (most recent call last):
  File "/Exscript/src/Exscript/workqueue/Job.py", line 64, in run
    self.function(self)
  File "/Exscript/src/Exscript/Queue.py", line 91, in _wrapped
    result = func(job, host, conn, *args, **kwargs)
  File "/Exscript/src/Exscript/util/decorator.py", line 105, in decorated
    conn.login(flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 627, in login
    self.authenticate(account, flush = False)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 651, in authenticate
    self.app_authenticate(app_account, flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 819, in app_authenticate
    self._app_authenticate(account, password, flush, bailout)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 738, in _app_authenticate
    raise LoginFailure("Login failed")
LoginFailure: Login failed

192.168.1.189 finally failed.

If try/except could work with this it would be great, but the try doesn't catch this exception :(

braincrash commented 9 years ago

Does this help?

Enabled debug level 5

generic: Attempting to authenticate admin.
generic: Attempting to app-authenticate admin.
generic: waiting for: ['[\\r\\n][^\\r\\n]*(?:bad secrets|denied|invalid|too short|incorrect|connection timed out|failed|failure)', '(user ?name|user|login): *$', '(?:s\\/key|otp-md4) (\\d+) (\\S+)', 'password:? *$', '[\\r\\n](?:[^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~\\ \\\t\\\n\\\r\\\x0b\\\x0c]*|[\\x1b\\x07\\x00]*)[\\[\\<]?\\w+(?:(?:(?:[\\w+\\-]+)\\@)?(?:[\\w+\\-\\.]+))?:?(?:(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)|~(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)?)?[: ]?(?:(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)|~(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)?)?(?:\\((?:[\\w\\+\\-\\._]+)\\))?[\\]\\-]?[#>%\\$\\]] ?[^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~\\ \\\t\\\n\\\r\\\x0b\\\x0c]*\\Z']
Telnet(192.168.1.189,23): Expecting ['[\\r\\n][^\\r\\n]*(?:bad secrets|denied|invalid|too short|incorrect|connection timed out|failed|failure)', '(user ?name|user|login): *$', '(?:s\\/key|otp-md4) (\\d+) (\\S+)', 'password:? *$', '[\\r\\n](?:[^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~\\ \\\t\\\n\\\r\\\x0b\\\x0c]*|[\\x1b\\x07\\x00]*)[\\[\\<]?\\w+(?:(?:(?:[\\w+\\-]+)\\@)?(?:[\\w+\\-\\.]+))?:?(?:(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)|~(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)?)?[: ]?(?:(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)|~(?:(?:(?:[\\w\\+\\-\\._]+))?(?:/(?:[\\w\\+\\-\\._]+))*/?)?)?(?:\\((?:[\\w\\+\\-\\._]+)\\))?[\\]\\-]?[#>%\\$\\]] ?[^0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ\\!\\"\\#\\$\\%\\&\\\'\\(\\)\\*\\+\\,\\-\\.\\/\\:\\;\\<\\=\\>\\?\\@\\[\\\\\\]\\^\\_\\`\\{\\|\\}\\~\\ \\\t\\\n\\\r\\\x0b\\\x0c]*\\Z']
Telnet(192.168.1.189,23): recv 'Remote management disabled or remote management IP is limited !!'
ADBPirelli1000: Protocol: driver replaced: generic -> ADBPirelli1000
Telnet(192.168.1.189,23): cancelling expect()
ADBPirelli1000: Response was ''
ADBPirelli1000: Protocol.app_authenticate(): driver replaced
ADBPirelli1000: waiting for: ['emote management disabled or remote management IP is limited', 'User Name : ', '(?:s\\/key|otp-md4) (\\d+) (\\S+)', 'User Password : ', '(\\w+) :\\> ']
Telnet(192.168.1.189,23): Expecting ['emote management disabled or remote management IP is limited', 'User Name : ', '(?:s\\/key|otp-md4) (\\d+) (\\S+)', 'User Password : ', '(\\w+) :\\> ']
ADBPirelli1000: Got a prompt, match was 'emote management disabled or remote management IP is limited'
ADBPirelli1000: Response was 'Remot'
192.168.1.189 error: Login failed
Traceback (most recent call last):
  File "/Exscript/src/Exscript/workqueue/Job.py", line 64, in run
    self.function(self)
  File "/Exscript/src/Exscript/Queue.py", line 91, in _wrapped
    result = func(job, host, conn, *args, **kwargs)
  File "/Exscript/src/Exscript/util/decorator.py", line 105, in decorated
    conn.login(flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 627, in login
    self.authenticate(account, flush = False)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 651, in authenticate
    self.app_authenticate(app_account, flush = flush)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 820, in app_authenticate
    self._app_authenticate(account, password, flush, bailout)
  File "/Exscript/src/Exscript/protocols/Protocol.py", line 739, in _app_authenticate
    raise LoginFailure("Login failed")
LoginFailure: Login failed

192.168.1.189 finally failed.

Driver

import re
from Exscript.protocols.drivers.driver import Driver

_user_re     = [re.compile(r'User Name : ', re.I)]
_password_re = [re.compile(r'User Password : ', re.I)]
_prompt_re   = [re.compile(r'(\w+) :\> ', re.M|re.S|re.I)]
_login_fail_re = [re.compile(r'emote management disabled or remote management IP is limited', re.I)]
_error_re = [re.compile(r'invalid input')]

class ADBPirelli1000Driver(Driver):
    def __init__(self):
        Driver.__init__(self, 'ADBPirelli1000')
        self.user_re     = _user_re
        self.password_re = _password_re
        self.prompt_re   = _prompt_re
        self.login_error_re = _login_fail_re
        self.error_re    = _error_re

    def check_head_for_os(self, string):
        if 'emote management disabled' in string.lower():
            return 88
        if _user_re[0].search(string):
            return 70
        return 0

egroeper commented 9 years ago

Can you provide a minimal code example? I don't use telnet, but that shouldn't be the problem, I suppose.

Here is an adapted version of simple.py that works for me:

from Exscript.util.interact import read_login
from Exscript.protocols import SSH2
from Exscript.protocols.Exception import TimeoutException

account = read_login()

conn = SSH2()
conn.connect('<target host>')
try:
    conn.login(account)
except TimeoutException:
    print "blubb"
else:
    conn.execute('ls -l')

    print "Response was:", repr(conn.response)

    conn.send('exit\r')
    conn.close()

Results (with a remote host not showing a usual prompt):

python simple.py 
Please enter your user name [testuser]:
Please enter your password: 
blubb

egroeper commented 9 years ago

With Telnet it works, too.

Emulating dumb telnet (not responding to anything, but accepting connections):

sudo nc -l 127.0.0.1 23 &

Demo code:

from Exscript.util.interact import read_login
from Exscript.protocols import Telnet
from Exscript.protocols.Exception import TimeoutException

account = read_login()

conn = Telnet()
conn.connect('127.0.0.1')
try:
    conn.login(account)
except TimeoutException:
    print "blubb"
else:
    conn.execute('ls -l')

    print "Response was:", repr(conn.response)

    conn.send('exit\r')
    conn.close()

Output:

python simple.py 
Please enter your user name [testuser]:
Please enter your password: 
blubb

braincrash commented 9 years ago

Hi,

But I'm using start()... I can try your method... (...) OK, I've tried it. The problem is with start(): it doesn't pass the exception back :(

I'm using this for logging the output, and in the future I might use both Telnet and SSH.

start(account, host, cmds_to_execute, stdout=fout)

So this method is better for me because of the URI advantage. Can you check whether the start() function is OK?

braincrash commented 8 years ago

?

egroeper commented 8 years ago

Sorry for the late reply. Perhaps it is still of interest to you or somebody else.

I just took some time and dug into the code. It's not possible to catch exceptions if you use the convenience methods (start() and so on), and I don't see a meaningful way to change this. When using these methods, you are using exscript's high-level API, which takes care of parallelisation (threading) and exceptions. If you care about the exception info, I could think about extending Queue to allow you to define a callback function that would get called. Would that help you?

The details: using _ChildWatcher, every Job is run in a separate Thread. _ChildWatcher catches all exceptions, collects their exception info and passes it to workqueue.MainLoop._on_job_completed, which passes it to the job_error_event of the WorkQueue, which in turn calls Queue._on_job_error.
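
The effect described above can be reproduced with the standard library alone. The following is a simplified sketch, not Exscript's code: `run_job`, `on_job_error` and `failing_job` are hypothetical names. It shows that an exception raised in a worker thread never reaches a try/except in the calling thread, and that a callback receiving `sys.exc_info()` is one way to get at the information.

```python
import sys
import threading

errors = {}


def on_job_error(name, exc_info):
    # Hypothetical callback; receives the (type, value, traceback) tuple.
    errors[name] = exc_info


def run_job(name, function, error_cb):
    """Worker wrapper: catch everything and hand the info to a callback."""
    try:
        function()
    except Exception:
        error_cb(name, sys.exc_info())


def failing_job():
    raise RuntimeError('Login failed')


try:
    worker = threading.Thread(target=run_job,
                              args=('192.168.1.189', failing_job, on_job_error))
    worker.start()
    worker.join()
    caught_by_caller = False
except RuntimeError:
    # Never reached: the exception is raised and handled in the worker thread.
    caught_by_caller = True

print('caught by caller: %s' % caught_by_caller)
print('reported via callback: %s' % errors['192.168.1.189'][1])
```

The try/except around the thread handling plays the role of a try/except around start(): it stays silent, while the callback still sees the exception.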

runborg commented 8 years ago

Hi!

I see this is an old thread, but as a hotfix it is possible to use the _on_job_error routine inside Queue to collect info about errors on devices. I've attached an example:

This example creates a new class that extends Exscript.Queue and intercepts _on_job_error before running the original Queue._on_job_error. I've used this code to collect the devices that fail when I gather info from them.

The example is runnable as-is. :)

from Exscript                import Account, Host, Queue
from Exscript.util.decorator import autologin
from Exscript.util.interact  import read_login

class TraceQueue(Queue):
  errors = dict()
  def _on_job_error(self, job, exc_info):
    self.errors[job.name] = exc_info
    super(TraceQueue, self)._on_job_error(job, exc_info)

runconfig = dict()

def run_on_host(job, host, conn):
  conn.execute('term len 0')
  conn.execute('show run')
  runconfig[conn.get_host()] = conn.response

hosts = [Host("router1", default_protocol='ssh')]
default_account = read_login()

queue = TraceQueue(max_threads=10)
queue.add_account(default_account)
queue.run(hosts, autologin()(run_on_host))
queue.destroy()

print "--------------"
print "Failed devices %s" % len(queue.errors)
for dev in queue.errors:
  print "%30s %30s" % (dev, queue.errors[dev][1])
print "--------------"
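
The override-and-delegate pattern used above can be tried out without any devices. This is a sketch with a stand-in base class (`BaseQueue` is hypothetical and does not reflect how Exscript's Queue works internally). One assumption worth flagging: the error dictionary is created per instance here, because a dictionary defined at class level would be shared by every queue object.

```python
class BaseQueue(object):
    """Stand-in for a library queue class; not Exscript's implementation."""

    def __init__(self):
        self.logged = []

    def _on_job_error(self, name, exc_info):
        self.logged.append(name)


class TraceQueue(BaseQueue):
    def __init__(self):
        super(TraceQueue, self).__init__()
        # Per-instance dictionary, so separate queues do not share errors.
        self.errors = {}

    def _on_job_error(self, name, exc_info):
        self.errors[name] = exc_info                           # record first
        super(TraceQueue, self)._on_job_error(name, exc_info)  # then delegate


first, second = TraceQueue(), TraceQueue()
first._on_job_error('router1', (ValueError, ValueError('Login failed'), None))
print(sorted(first.errors))
print(sorted(second.errors))
```

Because the subclass still calls the parent handler, the base class's own error handling keeps working; the subclass only adds bookkeeping.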

egroeper commented 8 years ago

@runborg You are welcome! Unfortunately, your solution really only seems like a hack to me. In particular, accessing the queue after calling queue.destroy() seems to be against @knipknap's intention when writing the code.

I will try to add an optional callback to Queue, which gets the exception info.

egroeper commented 8 years ago

@runborg, @braincrash Please have a look at PR #121. This should enable a clean solution to your problem.

The following could be used to collect exceptions:

#!/usr/bin/env python
from Exscript.util.match    import any_match
from Exscript.util.template import eval_file
from Exscript.util.start    import quickstart

exceptions = {}

def collect_exceptions(jobname, exc_info):
    exceptions[jobname] = exc_info

def do_something(job, host, conn):
    conn.execute('ls -1')
    files = any_match(conn, r'(\S+)')
    print "Files found:", files

# Open a connection (Telnet, by default) to each of the hosts, and run
# do_something(). To open the connection via SSH, you may prefix the
# hostname by the protocol, e.g.: 'ssh://hostname', 'telnet://hostname',
# etc.
quickstart(('localhost', 'otherhost'), do_something, exc_cb = collect_exceptions)

print "--------------"
print "Failed devices %s" % len(exceptions)
for dev in exceptions:
  print "%30s %30s" % (dev, exceptions[dev][1])
print "--------------"
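
If more than the message is needed, the collected values can be expanded into a full report. This sketch assumes that each stored exc_info is the usual (type, value, traceback) tuple as returned by `sys.exc_info()`, which the `[1]` indexing above suggests but which is not confirmed here. The helper `describe` is a hypothetical name, and the dictionary is filled by hand instead of by a callback.

```python
import sys
import traceback


def describe(exc_info):
    """Return (exception class name, message, full traceback text)."""
    exc_type, exc_value, exc_tb = exc_info
    text = ''.join(traceback.format_exception(exc_type, exc_value, exc_tb))
    return exc_type.__name__, str(exc_value), text


# Stand-in for a dictionary filled by a callback like collect_exceptions().
exceptions = {}
try:
    raise ValueError('Login failed')
except ValueError:
    exceptions['192.168.1.189'] = sys.exc_info()

for dev, exc_info in exceptions.items():
    name, message, text = describe(exc_info)
    print('%s: %s (%s)' % (dev, message, name))
```

The class name makes it possible to tell a timeout from a login failure, and the message string is where the device's own text (such as the buffer contents) would show up.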

runborg commented 8 years ago

@egroeper I agree, my solution is only a hack! But it works well as an example of how it could be done. Your code in the PR is much cleaner and fully integrated into the library! What about adding your example code to the examples directory?