napalm-automation / napalm

Network Automation and Programmability Abstraction Layer with Multivendor support
Apache License 2.0
2.24k stars 552 forks source link

IOS-XR (timeout?) issue on get_bgp_neighbors (XR 5.3.4) #849

Closed jlixfeld closed 5 years ago

jlixfeld commented 5 years ago

Not sure what to make of this, but I suspect it's a timeout issue although I'm not sure if it's user error or something else. I've tried to pass --optional_args timeout=120 to extend it a bit, but that still seems like a long time to return results. We're connecting, as can be seen from the successful open(), connection_tests() and get_facts().

Looking at the device logs, the session is closed very shortly after the traceback is thrown, so it would seem to me it's not a keepalive issue closing the session.

I'm not finding any results searching for similar symptoms, so either no one is using napalm with XR 5.3.4, or this issue is something unique to me :)

Does anyone have some clue to lob over?

jlixfeld@peeringmanager:~$ napalm --debug --user rancid --password * --vendor iosxr 10.219.49.2 call get_bgp_neighbors
2018-10-25 17:56:07,634 - napalm - DEBUG - Starting napalm's debugging tool
2018-10-25 17:56:07,634 - napalm - DEBUG - Gathering napalm packages
2018-10-25 17:56:07,660 - napalm - DEBUG - napalm-base==1.0.0
2018-10-25 17:56:07,660 - napalm - DEBUG - napalm==2.3.3
2018-10-25 17:56:07,660 - napalm - DEBUG - get_network_driver - Calling with args: ('iosxr',), {}
2018-10-25 17:56:07,673 - napalm - DEBUG - get_network_driver - Successful
2018-10-25 17:56:07,674 - napalm - DEBUG - __init__ - Calling with args: (<class 'napalm.iosxr.iosxr.IOSXRDriver'>, '10.219.49.2', 'rancid'), {'password': '*******', 'timeout': 60, 'optional_args': {}}
2018-10-25 17:56:07,674 - napalm - DEBUG - __init__ - Successful
2018-10-25 17:56:07,674 - napalm - DEBUG - pre_connection_tests - Calling with args: (<napalm.iosxr.iosxr.IOSXRDriver object at 0x7f9cfa03db70>,), {}
2018-10-25 17:56:07,674 - napalm - DEBUG - open - Calling with args: (<napalm.iosxr.iosxr.IOSXRDriver object at 0x7f9cfa03db70>,), {}
2018-10-25 17:56:18,579 - napalm - DEBUG - open - Successful
2018-10-25 17:56:18,580 - napalm - DEBUG - connection_tests - Calling with args: (<napalm.iosxr.iosxr.IOSXRDriver object at 0x7f9cfa03db70>,), {}
2018-10-25 17:56:18,580 - napalm - DEBUG - get_facts - Calling with args: (<napalm.iosxr.iosxr.IOSXRDriver object at 0x7f9cfa03db70>,), {}
2018-10-25 17:56:29,645 - napalm - DEBUG - Gathered facts:
{
    "os_version": "5.3.4",
...
<snipped for brevity>
...
}
2018-10-25 17:56:29,651 - napalm - DEBUG - get_facts - Successful
2018-10-25 17:56:29,651 - napalm - DEBUG - method - Calling with args: (<napalm.iosxr.iosxr.IOSXRDriver object at 0x7f9cfa03db70>, 'get_bgp_neighbors'), {}
2018-10-25 17:56:29,651 - napalm - DEBUG - get_bgp_neighbors - Attempting to resolve method
2018-10-25 17:56:29,651 - napalm - DEBUG - get_bgp_neighbors - Attempting to call method with kwargs: {}
2018-10-25 17:57:52,386 - napalm - ERROR - method - Failed: Unable to process the XML Response from the device!

================= Traceback =================

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pyIOSXR/iosxr.py", line 368, in _execute_rpc
    root = ET.fromstring(str.encode(response))
  File "src/lxml/etree.pyx", line 3213, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1877, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1765, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1127, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 601, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 711, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 640, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/napalm", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.5/dist-packages/napalm_base/clitools/cl_napalm.py", line 285, in main
    run_tests(args)
  File "/usr/local/lib/python3.5/dist-packages/napalm_base/clitools/cl_napalm.py", line 268, in run_tests
    call_getter(device, args.method, **method_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/napalm_base/clitools/cl_napalm.py", line 27, in wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/napalm_base/clitools/cl_napalm.py", line 238, in call_getter
    r = func(**kwargs)
  File "/usr/local/lib/python3.5/dist-packages/napalm/iosxr/iosxr.py", line 354, in get_bgp_neighbors
    result_tree = ETREE.fromstring(self.device.make_rpc_call(rpc_command))
  File "/usr/local/lib/python3.5/dist-packages/pyIOSXR/iosxr.py", line 151, in make_rpc_call
    result = self._execute_rpc(rpc_command)
  File "/usr/local/lib/python3.5/dist-packages/pyIOSXR/iosxr.py", line 372, in _execute_rpc
    raise InvalidXMLResponse('Unable to process the XML Response from the device!', self)
pyIOSXR.exceptions.InvalidXMLResponse: Unable to process the XML Response from the device!
jlixfeld@peeringmanager:~$
jlixfeld commented 5 years ago

For the archives, this was due to a few different things. Mainly, I suspect a bug in XR 5.3.4 where the output was too large and required streaming to be enabled. To throw a bunch of poo against the wall to see what stuck while debugging this, I also maxed out the throttle memory, but I didn't test to see if removing the memory throttle but leaving the streaming would have any affect. YMMV.

!
xml agent tty
 throttle memory 600
 iteration off
 streaming on size 48
!

Lastly, the output that was finally produced when running this on the XR CLI was so incredibly huge, I have to think that there is some sort of XR bug that created a loop of the same XML output. I saw the same data stream many tens of times in the output before I finally went ^C on it after over 5 minutes of it spewing whatever it was spewing.

This will produce the loop:

RP/0/RSP0/CPU0:bfr01.905KingStW01.YYZ#xml format echo
Sat Oct 27 10:51:48.007 EDT
XML><?xml version="1.0" encoding="UTF-8"?>
<Request MajorVersion="1" MinorVersion="0">
    <Get>
        <Operational>
            <BGP>
                <ConfigInstanceTable>
                    <ConfigInstance>
                        <Naming>
                            <InstanceName>
                                default
                            </InstanceName>
                        </Naming>
                        <ConfigInstanceVRFTable>

                        </ConfigInstanceVRFTable>
                    </ConfigInstance>
                </ConfigInstanceTable>
            </BGP>
        </Operational>
    </Get>
</Request>

When limiting it only to the VRF you're interested in, it works fine.

XML> <?xml version="1.0" encoding="UTF-8"?>
<Request MajorVersion="1" MinorVersion="0">
    <Get>
        <Operational>
            <BGP>
                <InstanceTable>
                    <Instance>
                        <Naming>
                            <InstanceName>
                                default
                            </InstanceName>
                        </Naming>
                        <InstanceActive>
                            <VRFTable>
                                <VRF>
                                    <Naming>
                                        Inetv4
                                    </Naming>
                                    <GlobalProcessInfo>

                                    </GlobalProcessInfo>
                                    <NeighborTable>

                                    </NeighborTable>
                                </VRF>
                            </VRFTable>
                        </InstanceActive>
                    </Instance>
                </InstanceTable>
            </BGP>
        </Operational>
    </Get>
</Request>
<?xml version="1.0" encoding="UTF-8"?>
<Response MajorVersion="1" MinorVersion="0">
    <Get>
...
...
...
    </Get>
    <ResultSummary ErrorCount="0"/>
</Response>
XML>
beufanet commented 5 years ago

How did you to show only default VRF results using napalm ?

ktbyers commented 5 years ago

@beufanet You might want to ask a question in the slack channel: https://github.com/napalm-automation/napalm#slack

Regards,

Kirk

jlixfeld commented 5 years ago

@beufanet I never specifically attempted to show the default VRF, but I believe the first example above would provide either the default VRF or both the default VRF and any non-default VRFs.

jbotello7381 commented 4 years ago

I end here, after looking for some answers on the same error. @jlixfeld are saying the issue was solved by modifying xr xml configuration? with the following config?

!
xml agent tty
 throttle memory 600
 iteration off
 streaming on size 48
!
jlixfeld commented 4 years ago

@jbotello7381 I specifically recall, but I don't believe so, no. There was never any real attempt made to resolve the issue for the default VRF. My interest was only in having NAPALM return results for one specific VRF, and I had to hack NAPALM in order to achieve that.

When coupled with this patch for NAPALM 2.3.3...

BlackBox:napalm jlixfeld$ git diff
diff --git a/napalm/iosxr/iosxr.py b/napalm/iosxr/iosxr.py
index 491bf0ce..5d39858f 100644
--- a/napalm/iosxr/iosxr.py
+++ b/napalm/iosxr/iosxr.py
@@ -345,7 +344,7 @@ class IOSXRDriver(NetworkDriver):
         because bulk-getting all instance-data to do the same could get ridiculously heavy
         Assuming we're always interested in the DefaultVRF
         """
-
+        """
         active_vrfs = ["global"]

         rpc_command = '<Get><Operational><BGP><ConfigInstanceTable><ConfigInstance><Naming>\
@@ -357,7 +356,9 @@ class IOSXRDriver(NetworkDriver):
         for node in result_tree.xpath('.//ConfigVRF'):
             active_vrfs.append(napalm.base.helpers.find_txt(node, 'Naming/VRFName'))

+        """
         result = {}
+        active_vrfs = ["Inetv4"]

         for vrf in active_vrfs:
             rpc_command = generate_vrf_query(vrf)
@@ -912,6 +913,7 @@ class IOSXRDriver(NetworkDriver):

         bgp_neighbors_detail = {}

+        '''
         active_vrfs = ['default']

         active_vrfs_rpc_request = '<Get><Operational><BGP><ConfigInstanceTable><ConfigInstance>\
@@ -926,6 +928,8 @@ class IOSXRDriver(NetworkDriver):
             active_vrfs.append(napalm.base.helpers.find_txt(active_vrf_tree, 'Naming/VRFName'))

         unique_active_vrfs = sorted(set(active_vrfs))
+        '''
+        unique_active_vrfs = ['Inetv4']

         bgp_neighbors_vrf_all_rpc = '<Get><Operational><BGP><InstanceTable><Instance><Naming>\
         <InstanceName>default</InstanceName></Naming>'
@@ -1081,7 +1085,7 @@ class IOSXRDriver(NetworkDriver):
                     'advertised_prefix_count': advertised_prefix_count,
                     'flap_count': flap_count
                 })
-        bgp_neighbors_detail['global'] = bgp_neighbors_detail.pop('default')
+        #bgp_neighbors_detail['global'] = bgp_neighbors_detail.pop('default')
         return bgp_neighbors_detail

     def get_arp_table(self):
BlackBox:napalm jlixfeld$

... this XR config was sufficient:

!
xml agent tty
 iteration off
!

However, YMMV as this will not patch cleanly on > NAPALM 2.3.3.