napalm-automation / napalm-iosxr


device connects successfully but methods give timeout error #151

Closed wasabi222 closed 7 years ago

wasabi222 commented 7 years ago

Description of Issue/Question

I can connect to a device successfully, but when I call a method such as get_bgp_neighbors or get_facts I get a timeout error. The connection to the device remains active, though.

Did you follow the steps from https://github.com/napalm-automation/napalm#faq

[x] Yes [ ] No

Setup

napalm-iosxr version

(Paste verbatim output from pip freeze | grep napalm-iosxr between quotes below)

napalm-iosxr==0.5.4

IOS-XR version and platform details

(Paste the complete verbatim output from show version brief between quotes below)

#sh version brief
Wed Oct 11 16:18:04.270 EDT

Cisco IOS XR Software, Version 5.1.3[Default]
Copyright (c) 2016 by Cisco Systems, Inc.

ROM: System Bootstrap, Version 0.73(c) 1994-2012 by Cisco Systems,  Inc.

gw1.fnc1 uptime is 4 weeks, 1 day, 8 hours, 50 minutes
System image file is "disk0:asr9k-os-mbi-5.1.3.sp10-1.0.0/0x100305/mbiasr9k-rsp3.vm"

cisco ASR9K Series (Intel 686 F6M14S4) processor with 6291456K bytes of memory.
Intel 686 F6M14S4 processor at 2127MHz, Revision 2.174
ASR 9006 AC Chassis with PEM Version 2

2 Management Ethernet
2 TenGigE
2 DWDM controller(s)
2 WANPHY controller(s)
503k bytes of non-volatile configuration memory.
6111M bytes of hard disk.
12510192k bytes of disk0: (Sector size 512 bytes).
12510192k bytes of disk1: (Sector size 512 bytes).

Steps to Reproduce the Issue

Python 2.7.9 (default, Jun 29 2016, 13:08:31)
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import napalm
>>> dr = napalm.get_network_driver('iosxr')
>>> gw = dr('dev', 'user', 'passwd')
>>> gw.open()
>>> n = gw.get_bgp_neighbors()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/napalm_iosxr/iosxr.py", line 351, in get_bgp_neighbors
    result_tree = ETREE.fromstring(self.device.make_rpc_call(rpc_command))
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 151, in make_rpc_call
    result = self._execute_rpc(rpc_command)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 365, in _execute_rpc
    response = self._send_command(xml_rpc_command, delay_factor=delay_factor)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 301, in _send_command
    if not self._timeout_exceeded(start=start):
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 190, in _timeout_exceeded
    raise TimeoutError(msg, self)
pyIOSXR.exceptions.TimeoutError: Timeout exceeded!
>>> gw.close()
>>> gw.open()
>>> facts = gw.get_facts()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/napalm_iosxr/iosxr.py", line 193, in get_facts
    facts_rpc_reply = ETREE.fromstring(self.device.make_rpc_call(facts_rpc_request))
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 151, in make_rpc_call
    result = self._execute_rpc(rpc_command)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 365, in _execute_rpc
    response = self._send_command(xml_rpc_command, delay_factor=delay_factor)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 301, in _send_command
    if not self._timeout_exceeded(start=start):
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 190, in _timeout_exceeded
    raise TimeoutError(msg, self)
pyIOSXR.exceptions.TimeoutError: Timeout exceeded!
>>> gw.is_alive()
{u'is_alive': True}
>>>

Error Traceback

(Paste the complete traceback of the exception between quotes below)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/napalm_iosxr/iosxr.py", line 193, in get_facts
    facts_rpc_reply = ETREE.fromstring(self.device.make_rpc_call(facts_rpc_request))
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 151, in make_rpc_call
    result = self._execute_rpc(rpc_command)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 365, in _execute_rpc
    response = self._send_command(xml_rpc_command, delay_factor=delay_factor)
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 301, in _send_command
    if not self._timeout_exceeded(start=start):
  File "/usr/local/lib/python2.7/dist-packages/pyIOSXR/iosxr.py", line 190, in _timeout_exceeded
    raise TimeoutError(msg, self)
pyIOSXR.exceptions.TimeoutError: Timeout exceeded!

mirceaulinic commented 7 years ago

Hi @wasabi222 - thanks for reporting this.

I am not sure whether this is an IOS-XR-specific problem or a general netmiko one. I think the device drops the connection after a while (I've seen the same on IOS, without the is_alive check). It is bizarre, however, that is_alive still returns True. Do you have any NXOS or IOS device you could connect to via SSH/netmiko, to confirm whether the same timeout occurs there while is_alive still reports the connection as alive?

wasabi222 commented 7 years ago

Hi Mircea,

The above error happens immediately after opening a connection to an iosxr device, so I'm not sure it's related to a connection timeout after an extended duration.

I tested with napalm_ios against an IOS device and could call methods like get_facts without any problem.

mirceaulinic commented 7 years ago

Interesting... can you confirm the user is still connected (show users)? It might just be that the XML API takes very long to reply. Some time ago someone told me that it took about 8 minutes (yeah, that's mental) to reply to a simple request.

Can you enter XML mode (xml echo format) and paste in the following XML to see whether the device responds slowly:

<?xml version="1.0" encoding="UTF-8"?><Request MajorVersion="1" MinorVersion="0">
<Get><Operational><SystemTime/><PlatformInventory/></Operational></Get>
</Request>

Or (if you have the same issue with get_lldp_neighbors or get_lldp_neighbors_detail):

<?xml version="1.0" encoding="UTF-8"?><Request MajorVersion="1" MinorVersion="0">
<Get><Operational><LLDP></LLDP></Operational></Get>
</Request>
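
For anyone who wants to generate these test requests rather than paste them by hand, here is a minimal sketch using only the Python standard library. The helper name build_get_request is hypothetical and is not part of napalm or pyIOSXR; it only builds the same Get/Operational request shape shown above, and it is written for Python 3 (on Python 2.7 the `encoding="unicode"` argument would need to change).

```python
import xml.etree.ElementTree as ET


def build_get_request(*operational_tags):
    """Return an IOS-XR XML 'Get Operational' request as a string.

    Hypothetical helper for illustration only; each name in
    operational_tags becomes an empty child of <Operational>.
    """
    request = ET.Element("Request", MajorVersion="1", MinorVersion="0")
    operational = ET.SubElement(ET.SubElement(request, "Get"), "Operational")
    for tag in operational_tags:
        ET.SubElement(operational, tag)
    body = ET.tostring(request, encoding="unicode")
    # ElementTree does not emit the XML declaration here, so prepend it.
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


# The two requests from this comment.
facts_request = build_get_request("SystemTime", "PlatformInventory")
lldp_request = build_get_request("LLDP")
print(facts_request)
print(lldp_request)
```

The output is semantically the same XML as above, although ElementTree writes empty elements in the short form (for example `<LLDP />` instead of `<LLDP></LLDP>`).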
mirceaulinic commented 7 years ago

Moved to https://github.com/napalm-automation/napalm/issues/433

@wasabi222 meanwhile, can you please answer my question above?

wasabi222 commented 7 years ago

Hi Mircea,

I ran

<?xml version="1.0" encoding="UTF-8"?><Request MajorVersion="1" MinorVersion="0">
<Get><Operational><SystemTime/><PlatformInventory/></Operational></Get>
</Request>

In XML mode the device took close to 30 seconds to reply, which could be what makes netmiko time out while waiting for the response ("Timeout exceeded!"). I also checked, and show users still shows me as connected.
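
One way to test that theory from Python is to time the call and open the session with a larger timeout. The sketch below is untested against a real device and rests on assumptions: that the `timeout` keyword accepted by NAPALM driver constructors is passed through to pyIOSXR in this version, and that 120 seconds is long enough (an arbitrary illustrative value). The names timed_call and get_facts_with_timeout are hypothetical helpers, not part of napalm. Nothing in this thread confirms that raising the timeout resolves the issue.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = func(*args, **kwargs)
    return result, time.monotonic() - start


def get_facts_with_timeout(hostname, username, password, timeout=120):
    """Open an IOS-XR session with a longer timeout and time get_facts().

    Assumes napalm is installed and the device is reachable. The 120 s
    default is an illustrative guess, not a recommended value.
    """
    import napalm  # imported here so timed_call works without napalm

    driver = napalm.get_network_driver("iosxr")
    device = driver(hostname, username, password, timeout=timeout)
    device.open()
    try:
        return timed_call(device.get_facts)
    finally:
        device.close()
```

Usage would look like `facts, seconds = get_facts_with_timeout('dev', 'user', 'passwd')`. If the call succeeds and `seconds` is in the same range as the delay seen in XML mode, that would point at slow XML API replies rather than a dropped connection.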