Uninett / nav

Network Administration Visualized
GNU General Public License v3.0

All ipdevpoll jobs, except `dns` and `snmpcheck`, stopped working after upgrade to 4.7.0 #1528

Closed: Jo-Oiongen closed this issue 7 years ago

Jo-Oiongen commented 7 years ago

We are running a NAV appliance that has been kept up to date since the initial install using "apt-get update" >> "apt-get upgrade".

"uname -a" gives:

Linux navappliance 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2 (2017-04-30) x86_64 GNU/Linux

After upgrading yesterday, all ipdevpoll jobs except "dns" and "snmpcheck" have stopped working.

"nav status" gives:

Up: activeip alertengine eventengine ipdevpoll logengine mactrace maintengine navstats netbiostracker pping psuwatch servicemon smsd snmptrapd thresholdmon topology

While in the webgui the status is:

Dns          2017-06-09 13:39:40    0.02s    Success
Snmpcheck    2017-06-09 13:29:44    0.05s    Success
Ip2mac       2017-06-08 21:43:03    0.00s    Overdue
Statuscheck  2017-06-08 21:36:34    74.21s   Overdue
1minstats    2017-06-08 21:35:25    0.41s    Overdue
5minstats    2017-06-08 21:33:30    45.32s   Overdue
Topo         2017-06-08 21:32:49    11.67s   Overdue
Inventory    2017-06-08 19:17:41    334.55s  Overdue

Please let me know if you need any additional information, extracts from logs etc.

lunkwill42 commented 7 years ago

The changes made to enable the new multiprocess mode switched from passing Netbox objects to job handlers to passing only the netbox id (to avoid complex serialization of Netbox objects to child processes).

The loaded objects have meta-properties attached to them that indicate whether there are active snmpAgentState alerts and that list the last run time of every job for the netbox. These meta-properties are lost because each JobHandler now loads an entirely new Netbox instance from the database. As a result, every plugin thinks that the Netbox is not currently responding to SNMP and consequently ignores it.
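
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not taken from the NAV code base: the `Netbox` dataclass, the `snmp_agent_up` and `last_runs` attributes, the loader functions, and the pessimistic default are all hypothetical stand-ins, chosen only to show how annotations attached after loading disappear when only the id is handed over and the object is reloaded.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a database table; not NAV's real schema.
DATABASE = {1: {"sysname": "example-sw"}}


@dataclass
class Netbox:
    """Hypothetical model: id and sysname are columns, the rest is metadata."""
    id: int
    sysname: str
    # Meta-properties are not stored in the database. A loader has to attach
    # them. The pessimistic default here is an assumption for illustration.
    snmp_agent_up: bool = False
    last_runs: dict = field(default_factory=dict)


def load_plain(netbox_id):
    """Reload a netbox by id only, as a job handler in a child process might."""
    row = DATABASE[netbox_id]
    return Netbox(id=netbox_id, sysname=row["sysname"])


def load_with_meta(netbox_id):
    """Load a netbox and attach the meta-properties, as the scheduler side does."""
    netbox = load_plain(netbox_id)
    netbox.snmp_agent_up = True  # assume no active snmpAgentState alert
    netbox.last_runs = {"inventory": "2017-06-08 19:17:41"}
    return netbox


def plugin_can_handle(netbox):
    """A plugin that skips devices it believes are not answering SNMP."""
    return netbox.snmp_agent_up


annotated = load_with_meta(1)
# Only the id crosses the process boundary, so the handler reloads from scratch.
reloaded = load_plain(annotated.id)

print(plugin_can_handle(annotated))  # True: metadata is present
print(plugin_can_handle(reloaded))   # False: metadata was lost on reload
```

In this sketch, the job still appears to start, but every plugin declines to do any work for the reloaded object, which matches the symptom of jobs never completing while the daemon itself reports as up.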

lunkwill42 commented 7 years ago

In case anyone is browsing this bug report and wondering: Yes, this will cause all your graphs to be blank, as no data is being collected.