Uninett / nav

Network Administration Visualized
GNU General Public License v3.0

ipdevpoll statuscheck errors out on some Aruba switches #882

Closed (jmbredal closed 7 years ago)

jmbredal commented 7 years ago

We are getting repeated errors from the ipdevpoll statuscheck job concerning a customer's Aruba switches:

2016-06-14 08:44:34,135 [ERROR jobs.jobhandler] [statuscheck aruba-sw.example.org] Caught exception during save. Last manager = EntityManager(<class 'nav.ipdevpoll.shadows.entity.NetboxEntity'>, 'ContainerRepository'(...)). Last model = <class 'nav.ipdevpoll.shadows.entity.NetboxEntity'>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/jobs.py", line 418, in perform_save
    manager.save()
  File "/usr/lib/python2.7/dist-packages/django/db/transaction.py", line 394, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/shadows/entity.py", line 49, in save
    self._delete_missing()
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/shadows/entity.py", line 73, in _delete_missing
    to_purge = self.get_purge_list()
  File "/usr/lib/python2.7/dist-packages/nav/ipdevpoll/shadows/entity.py", line 99, in get_purge_list
    self._logger.warning(
  File "/usr/lib/python2.7/dist-packages/networkx/algorithms/traversal/depth_first_search.py", line 101, in dfs_tree
    T.add_edges_from(dfs_edges(G,source))
  File "/usr/lib/python2.7/dist-packages/networkx/classes/digraph.py", line 552, in add_edges_from
    for e in ebunch:
  File "/usr/lib/python2.7/dist-packages/networkx/algorithms/traversal/depth_first_search.py", line 61, in dfs_edges
    stack = [(start,iter(G[start]))]
  File "/usr/lib/python2.7/dist-packages/networkx/classes/graph.py", line 319, in __getitem__
    return self.adj[n]
KeyError: <NetboxEntity: Chassis (Unnamed entity) at aruba-sw>

It seems this problem arises because the Aruba switches do not report any chassis entities in ENTITY-MIB::entPhysicalTable, but instead present themselves as a stack within a stack (although the switch isn't actually stacked).

This has unintended consequences for the system plugin: since it cannot see that any chassis devices have been found, it creates a new one in which to store the collected software version. The system plugin does not run during a statuscheck job, however, so that chassis is apparently absent from the entities collected by the job, and the algorithm that checks for missing devices fails with the KeyError shown above.
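The failure mode in the traceback can be reproduced in miniature. The following is only a sketch, not NAV's code: it uses a plain dict as the adjacency structure instead of networkx, and the entity names and the `dfs_nodes` helper are hypothetical. It shows that starting a depth-first traversal from a stored chassis that the current job never collected raises a KeyError, just as `G[start]` does in the traceback.

```python
def dfs_nodes(adjacency, source):
    """Return the nodes reachable from source, in depth-first order."""
    # Looking up an unknown source raises KeyError, like G[start] in networkx.
    stack = [(source, iter(adjacency[source]))]
    visited = [source]
    seen = {source}
    while stack:
        _parent, children = stack[-1]
        try:
            child = next(children)
        except StopIteration:
            stack.pop()
            continue
        if child not in seen:
            seen.add(child)
            visited.append(child)
            stack.append((child, iter(adjacency[child])))
    return visited


# Hypothetical entities as a statuscheck job might collect them:
# a stack within a stack, and no chassis entity at all.
collected = {
    "stack-outer": ["stack-inner"],
    "stack-inner": ["module-1"],
    "module-1": [],
}

# Hypothetical chassis that exists only in the database, created by an
# earlier run of another job.
stored_chassis = "chassis (unnamed entity)"

try:
    dfs_nodes(collected, stored_chassis)
    failure = None
except KeyError as error:
    failure = error
```

Traversing from an entity that the job did collect, such as `"stack-outer"`, works as expected; only the entity that is missing from the graph triggers the error.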


Imported from Launchpad using lp2gh.

jmbredal commented 7 years ago

(by mbrekkevold) Although it isn't really clear to me what to do with devices that claim to have no chassis at all, it seems we need to fix this problem in two ways:

First, the 'missing devices' algorithm should handle the case where a missing device is unknown to every plugin, although this will result in the device being deleted during a statuscheck.

Second, the system plugin should probably never create the chassis device in the first place, if it sees that the device has no chassis but at least one root entity.
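The two proposals above could be sketched roughly as follows. This is an illustration under stated assumptions and not the actual changeset: `subtree`, `get_purge_list` and `should_create_chassis` are hypothetical stand-ins, the entity graph is modelled as a plain dict, and entities are modelled as `(name, physical_class, parent)` tuples.

```python
def subtree(adjacency, source):
    """Return source and its descendants, or [] if source is not in the graph."""
    if source not in adjacency:
        return []
    found, pending = [], [source]
    while pending:
        node = pending.pop()
        if node not in found:
            found.append(node)
            # Assumption: children without their own entry have no descendants.
            pending.extend(reversed(adjacency.get(node, ())))
    return found


def get_purge_list(adjacency, missing):
    """First proposal: tolerate missing entities that no plugin reported."""
    purge = []
    for entity in missing:
        # An entity unknown to the graph is purged on its own instead of
        # raising KeyError; this does delete it during the job.
        for node in subtree(adjacency, entity) or [entity]:
            if node not in purge:
                purge.append(node)
    return purge


def should_create_chassis(entities):
    """Second proposal: only invent a chassis when nothing else could hold it."""
    has_chassis = any(cls == "chassis" for _name, cls, _parent in entities)
    has_root = any(parent is None for _name, _cls, parent in entities)
    return not has_chassis and not has_root


# Hypothetical data resembling the stack-within-a-stack case.
graph = {"stack-outer": ["stack-inner"], "stack-inner": []}
aruba_like = [("stack-outer", "stack", None),
              ("stack-inner", "stack", "stack-outer")]
```

With this data, `get_purge_list(graph, ["chassis"])` returns just the unknown chassis rather than failing, and `should_create_chassis(aruba_like)` is false because a root entity exists even though no chassis does.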

jmbredal commented 7 years ago

(by mbrekkevold) fix here: https://nav.uninett.no/hg/nav/rev/f32d35345aab

jmbredal commented 7 years ago

Translated changeset references: https://nav.uninett.no/hg/nav/rev/f32d35345aab: 52b24c705ae5ca10e2232e3f76094d6fde2c120e