Uninett / nav

Network Administration Visualized
GNU General Public License v3.0

[BUG] NAV will refuse to identify LLDP remote port names that contain trailing NUL bytes #2215

Closed hawken93 closed 3 years ago

hawken93 commented 3 years ago

Hi,

I'm having a problem with version 5.0.8 on Python 3.7 with a Cisco Nexus 5548 device. ipdevpoll runs the lldp plugin, but every remote interface name comes back with a long run of trailing '\x00' bytes appended. I wonder if this is a regression, because the same device seems to be able to build topology on a 5.0.5 installation with Python 3.5. (EDIT: or maybe the topology on the old installation has not been updated since the 4 series? I don't know.)

I cannot easily share output, but some key information is:

[WARNING plugins.lldp.lldpneighbor] [topo nexus.example] cannot search database for malformed neighboring port name 'ethernet1/1/61:3\x00\x00\x00\x00\x00\x00...'

Django then throws an exception: builtins.ValueError: 'A string literal cannot contain NUL (0x00) characters.'

The change below got rid of the warnings, but it did not fix the whole problem: the fatal Django exception still happens. I don't know the codebase well enough to strip the 0x00 bytes at the point where the data is first pulled from the device.

--- a/python/nav/ipdevpoll/neighbor.py
+++ b/python/nav/ipdevpoll/neighbor.py
@@ -212,6 +212,9 @@ class Neighbor(object):
         if not (self.netbox and name):
             return

+        # sometimes there are lots of null bytes..
+        name = name.rstrip('\x00')
+
         if is_invalid_database_string(name):
             self._logger.warning("cannot search database for malformed "
                                  "neighboring port name %r", name)

To Reproduce: Use this device with Python 3.7 (Debian buster) and release 5.0.8.

Expected behavior: I think the null bytes should be trimmed as the data is pulled from SNMP.
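
A minimal sketch of what trimming at ingestion could look like, assuming the values arrive as Python `str` or `bytes`. The `strip_trailing_nuls` and `clean_entry` helpers and the `RemoteEntry` tuple are hypothetical names made up for illustration, not part of NAV's code, and the sysname is a placeholder value.

```python
from collections import namedtuple

# Hypothetical stand-in for one row of LLDP remote data; not NAV's actual class.
RemoteEntry = namedtuple("RemoteEntry", "ifindex port_id port_desc sysname")


def strip_trailing_nuls(value):
    """Remove trailing NUL padding from str or bytes; pass other types through."""
    if isinstance(value, str):
        return value.rstrip("\x00")
    if isinstance(value, bytes):
        return value.rstrip(b"\x00")
    return value


def clean_entry(entry):
    """Return a copy of the entry with trailing NULs stripped from every field."""
    return type(entry)._make(strip_trailing_nuls(field) for field in entry)


# Port name padded with NULs, as in the log output; the sysname is a placeholder.
raw = RemoteEntry(436248576, "Eth1/21" + "\x00" * 25, "Ethernet1/21", "example-switch")
cleaned = clean_entry(raw)
print(repr(cleaned.port_id))  # 'Eth1/21'
```

Only trailing NULs are removed here. A NUL in the middle of a value is left alone, so a later validity check can still flag it as malformed.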


hawken93 commented 3 years ago

Same as #2176. I'll try out what was mentioned in https://github.com/Uninett/nav/issues/2176#issuecomment-698868594 and get back to you :)

lunkwill42 commented 3 years ago

This might be a regression due to updated handling of LLDP data in the 5.0 series - but a full traceback of the ValueError would be very helpful to identify the problem...

hawken93 commented 3 years ago

I've been able to get some logs out :)

lldp-nul-traceback.txt

lunkwill42 commented 3 years ago

Dumping the contents of the log inline here; it's short enough that having it in the thread is quicker for reference...

__ = redacted

2020-11-26 11:05:49,411 [DEBUG plugins.lldp.lldp] [topo nexus.example] LLDP neighbors:
 [LLDPNeighbor(ifindex=436240384, chassis_id=macAddress('__:__:__:__:__:__'), port_id=interfaceName('47\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='not advertised', sysname='__'),
 LLDPNeighbor(ifindex=436244480, chassis_id=macAddress('__:__:__:__:__:__'), port_id=interfaceName('47\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='not advertised', sysname='__'),
 LLDPNeighbor(ifindex=436248576, chassis_id=macAddress('__:__:__:__:__:__'), port_id=local('Eth1/21\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='Ethernet1/21', sysname='___'),
 LLDPNeighbor(ifindex=436252672, chassis_id=macAddress('__:__:__:__:__:__'), port_id=local('Eth1/21\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='Ethernet1/21', sysname='__'),
 LLDPNeighbor(ifindex=436256768, chassis_id=macAddress('__:__:__:__:__:__'), port_id=local('Eth1/13\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='_______', sysname='__'),
 LLDPNeighbor(ifindex=436260864, chassis_id=macAddress('__:__:__:__:__:__'), port_id=local('Eth1/14\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'), port_desc='_______', sysname='__'),

2020-11-24 19:13:20,051 [ERROR jobs.jobhandler] [topo nexus.example] Plugin nav.ipdevpoll.plugins.lldp.LLDP('nexus.example') reported an unhandled failure
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/twisted/internet/defer.py", line 500, in errback
    self._startRunCallbacks(fail)
  File "/usr/local/lib/python3.7/dist-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
    self._runCallbacks()
  File "/usr/local/lib/python3.7/dist-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/twisted/internet/defer.py", line 1442, in gotResult
    _inlineCallbacks(r, g, deferred)
--- <exception caught here> ---
  File "/usr/local/lib/python3.7/dist-packages/twisted/internet/defer.py", line 1384, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/usr/local/lib/python3.7/dist-packages/twisted/python/failure.py", line 408, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/plugins/lldp.py", line 85, in handle
    yield run_in_thread(self._process_remote)
  File "/usr/local/lib/python3.7/dist-packages/twisted/python/threadpool.py", line 250, in inContext
    result = inContext.theWork()
  File "/usr/local/lib/python3.7/dist-packages/twisted/python/threadpool.py", line 266, in <lambda>
    inContext.theWork = lambda: context.call(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python3.7/dist-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/db.py", line 100, in _reset
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/plugins/lldp.py", line 155, in _process_remote
    neighbors = [LLDPNeighbor(lldp) for lldp in self.remote]
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/plugins/lldp.py", line 155, in <listcomp>
    neighbors = [LLDPNeighbor(lldp) for lldp in self.remote]
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/neighbor.py", line 117, in __init__
    self.identify()
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/neighbor.py", line 120, in identify
    self.netbox = self._identify_netbox()
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/plugins/lldp.py", line 227, in _identify_netbox
    netbox = lookup(str(chassid))
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/plugins/lldp.py", line 238, in _netbox_from_local
    info_set__value=str(chassid),
  File "/usr/local/lib/python3.7/dist-packages/nav/ipdevpoll/neighbor.py", line 192, in _netbox_query
    netbox = manage.Netbox.objects.values('id', 'sysname').get(query)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 374, in get
    num = len(clone)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 232, in __len__
    self._fetch_all()
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 1121, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/query.py", line 106, in __iter__
    for row in compiler.results_iter(chunked_fetch=self.chunked_fetch):
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 841, in results_iter
    results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch)
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 899, in execute_sql
    raise original_exception
  File "/usr/local/lib/python3.7/dist-packages/django/db/models/sql/compiler.py", line 889, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
builtins.ValueError: A string literal cannot contain NUL (0x00) characters.

lunkwill42 commented 3 years ago

I think you found a very appropriate place to strip trailing NULs, we'll take it :)
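
The traceback earlier in this thread passes through `_netbox_from_local` with `str(chassid)`, which suggests that the chassis ID lookup can hit the same NUL problem as the port name. A minimal sketch of a guard that could run before any such lookup follows. The names `sanitize_lookup_value`, `guarded_lookup` and `fake_query` are hypothetical, and the database call is simulated; none of this is NAV's or Django's actual API.

```python
def sanitize_lookup_value(value):
    """Strip trailing NULs; return None if NULs remain elsewhere in the string."""
    cleaned = value.rstrip("\x00")
    if "\x00" in cleaned:
        return None
    return cleaned


def fake_query(value):
    """Simulated database lookup that rejects NUL bytes, like the error in the traceback."""
    if "\x00" in value:
        raise ValueError("A string literal cannot contain NUL (0x00) characters.")
    return {"lookup": value}


def guarded_lookup(value, query=fake_query):
    """Run the query only when the value is safe to send to the database."""
    cleaned = sanitize_lookup_value(value)
    if cleaned is None:
        return None  # malformed value: skip the lookup instead of crashing
    return query(cleaned)


print(guarded_lookup("Eth1/21" + "\x00" * 25))  # {'lookup': 'Eth1/21'}
print(guarded_lookup("Eth\x001/21"))            # None
```

Returning `None` for values with embedded NULs lets the caller log a warning and move on to the next neighbor, so one bad value does not abort the whole plugin run.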