networkupstools / nut

The Network UPS Tools repository. The UPS management protocol is published by the IETF as Informational RFC 9271: https://www.rfc-editor.org/info/rfc9271
https://networkupstools.org/

snmp-ups: load_mib2nut: testOID provided and doesn't match MIB 'apcc'! #700

Open jjakob opened 5 years ago

jjakob commented 5 years ago

Two configured SNMP UPSes have identical configuration apart from the name, IP address, and passwords. One always starts fine; the other never starts. I've tried swapping their order in ups.conf and commenting out the mibs=apcc line, with no change. The problem started after a recent apt-get dist-upgrade; I never had problems before that.

[snmpups01]
        driver = snmp-ups
        mibs = apcc
        port = 10.x.x.y
        snmp_version = v3
        secLevel = authPriv
        secName = authpriv
        authPassword = redacted
        privPassword = redacted
        desc = "APC UPS02 SNMP"

[snmpups02]
        driver = snmp-ups
        mibs = apcc
        port = 10.x.x.z
        snmp_version = v3
        secLevel = authPriv
        secName = authpriv
        authPassword = redacted
        privPassword = redacted
        desc = "APC UPS01 SNMP"

/lib/nut/snmp-ups -a snmpups01 -DD
Network UPS Tools - Generic SNMP UPS driver 0.98 (2.7.4.1)
   0.000000     debug level is '2'
   0.008755     SNMP UPS driver: entering upsdrv_initups()
   0.008802     SNMP UPS driver: entering nut_snmp_init(snmp-ups)
   0.017236     Setting SNMP retries to 5
   0.017269     Setting SNMP timeout to 1 second(s)
   0.054450     SNMP UPS driver: entering load_mib2nut(apcc)
   0.054472     load_mib2nut: trying classic method with 'apcc' mib
   0.054476     Testing ups.model using OID .1.3.6.1.4.1.318.1.1.1.1.1.1.0
   1.947277     load_mib2nut: testOID provided and doesn't match MIB 'apcc'!
   1.947331     Unknown mibs value: apcc

upsd -V
Network UPS Tools upsd 2.7.4.1

apt-cache policy nut-server 
nut-server:
  Installed: 2.7.4-0ubuntu7~xenial~libusb1~gb0e1758
  Candidate: 2.7.4-0ubuntu7~xenial~libusb1~gb0e1758
  Version table:
 *** 2.7.4-0ubuntu7~xenial~libusb1~gb0e1758 500
        500 http://ppa.launchpad.net/clepple/nut/ubuntu xenial/main amd64 Packages
        100 /var/lib/dpkg/status
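
To see what the card actually returns for the test OID named in the driver log, the same query can be issued by hand with net-snmp's snmpget. The following is a minimal Python sketch that only assembles the command line from the ups.conf values shown above; it does not run anything. The authentication and privacy protocols (MD5 and DES here) are assumptions, since the posted configuration does not set them, and the helper name build_snmpget_argv is hypothetical.

```python
import shlex

# OID copied from the driver log ("Testing ups.model using OID ...").
TEST_OID = ".1.3.6.1.4.1.318.1.1.1.1.1.1.0"


def build_snmpget_argv(host, sec_name, auth_password, priv_password,
                       auth_protocol="MD5", priv_protocol="DES",
                       retries=5, timeout=1, oid=TEST_OID):
    """Return an argv list for net-snmp's snmpget using SNMPv3 authPriv.

    auth_protocol and priv_protocol are assumptions: the ups.conf in the
    report does not set them, so adjust them to match the card's settings.
    retries and timeout mirror the values printed in the driver log.
    """
    return [
        "snmpget",
        "-v", "3",              # SNMP version, as in snmp_version = v3
        "-l", "authPriv",       # security level, as in secLevel
        "-u", sec_name,         # security name, as in secName
        "-a", auth_protocol,
        "-A", auth_password,
        "-x", priv_protocol,
        "-X", priv_password,
        "-r", str(retries),
        "-t", str(timeout),
        "-On",                  # print OIDs numerically
        host,
        oid,
    ]


# Placeholder values taken verbatim from the redacted configuration.
argv = build_snmpget_argv("10.x.x.y", "authpriv", "redacted", "redacted")
command_line = shlex.join(argv)
print(command_line)
```

Running the printed command against the failing and the working card and comparing the two answers would show whether the failing card replies at all, and if so, with what value.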

jjakob commented 5 years ago

Fixed by restarting the UPS management card. I have no idea what was wrong, as the card was pingable and accessible via the web GUI. The error didn't indicate that anything was wrong with the communication. Perhaps the error reporting should be more precise about what's going on.

clepple commented 5 years ago

You really shouldn't trust the guy who put together that PPA ;-) (I was wondering why upsd from a .deb was saying 2.7.4.1...)

When you say it happened after a recent apt-get dist-upgrade, did the nut-server package get upgraded then (i.e. you added the PPA)? Did the SNMP packages get upgraded?

A lot of the snmp-ups driver has been rewritten since that version. I am inclined to say that we need a bit more information to know what exactly went wrong, and therefore, what should trigger a better error message. If I had to guess, it looks like NUT contacted the UPS, but it returned something unexpected. On the other hand, I don't think we have enough information to say whether it is a bug in the SNMP module firmware or in NUT. (As you can imagine, it would be helpful for developers to be able to reproduce this.)
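
One hint that can be read from the posted debug log is how long the test OID probe took before it failed. The sketch below, in Python, parses the timestamps from the relevant log lines and compares the elapsed time with a worst-case wait. That worst-case figure is an assumption: it supposes the SNMP library waits one full timeout for the initial request and for each retry. The function names are hypothetical.

```python
import re

# Lines copied from the driver log in the original report.
LOG = """\
   0.017236     Setting SNMP retries to 5
   0.017269     Setting SNMP timeout to 1 second(s)
   0.054476     Testing ups.model using OID .1.3.6.1.4.1.318.1.1.1.1.1.1.0
   1.947277     load_mib2nut: testOID provided and doesn't match MIB 'apcc'!
"""

# A debug line is "<seconds since start>  <message>".
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(.*)$")


def parse_log(text):
    """Return a list of (timestamp, message) tuples for each debug line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def find_time(entries, prefix):
    """Return the timestamp of the first message starting with prefix."""
    for timestamp, message in entries:
        if message.startswith(prefix):
            return timestamp
    raise ValueError(f"no log line starts with {prefix!r}")


entries = parse_log(LOG)
started = find_time(entries, "Testing ups.model")
failed = find_time(entries, "load_mib2nut: testOID")
elapsed = round(failed - started, 6)

# Assumption: one timeout for the initial request plus one per retry.
retries, timeout = 5, 1
worst_case = (retries + 1) * timeout

print(f"probe took {elapsed} s; assumed worst-case wait is {worst_case} s")
```

Under that assumption the probe gave up well before every retry could have timed out, which is consistent with the guess that the card answered with something unexpected. It is not proof, and it does not say whether the fault lies in the card firmware or in NUT.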

If this happens again, I would recommend grabbing some of the snmpwalk info from the How to make a new subdriver... section of the developer manual ("mode 2: get data from files") before restarting the SNMP module.
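
A capture along those lines might be prepared as in the Python sketch below, which assembles two snmpwalk command lines: one printing numeric OIDs (-On) and one printing short names (-Os). It only prints the commands and does not run them. Several details are assumptions rather than anything stated in this thread: the output file names, the MD5 and DES protocols, and the choice of the APC enterprise subtree (.1.3.6.1.4.1.318, the prefix of the test OID in the log) as the starting point. The exact options the developer manual asks for should be checked there.

```python
import shlex

# Assumed starting point: the APC enterprise subtree, which is the prefix
# of the test OID shown in the driver log.
ROOT_OID = ".1.3.6.1.4.1.318"


def build_snmpwalk_argv(host, sec_name, auth_password, priv_password,
                        output_option, root_oid=ROOT_OID,
                        auth_protocol="MD5", priv_protocol="DES"):
    """Return an argv list for net-snmp's snmpwalk using SNMPv3 authPriv.

    output_option is "-On" (numeric OIDs) or "-Os" (short OID names).
    The protocol defaults are assumptions; match them to the card.
    """
    return [
        "snmpwalk",
        output_option,
        "-v", "3",
        "-l", "authPriv",
        "-u", sec_name,
        "-a", auth_protocol,
        "-A", auth_password,
        "-x", priv_protocol,
        "-X", priv_password,
        host,
        root_oid,
    ]


# Hypothetical output file names, one per output format.
captures = {"-On": "snmpwalk-On.log", "-Os": "snmpwalk-Os.log"}

commands = []
for option, filename in captures.items():
    argv = build_snmpwalk_argv("10.x.x.y", "authpriv", "redacted",
                               "redacted", option)
    # Redirect the walk into a file so it survives the card restart.
    commands.append(f"{shlex.join(argv)} > {filename}")

for command in commands:
    print(command)
```

Both walks need to be taken while the card is still misbehaving, since restarting it cleared the problem in this report.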

@aquette or @jimklimov, any other thoughts?

jjakob commented 5 years ago

The dist-upgrade was a red herring: I added that PPA a long time ago and it worked fine before. What I think happened is that after the dist-upgrade the server was restarted and tried to reconnect to the UPS management card, which then started behaving unexpectedly (triggered by the reconnect). Before restarting the card, I tried purging the PPA and downgrading to the distro packages (with ppa-purge, version 2.7.2-4ubuntu1.2). The errors were still present, so I restarted the NMC, which fixed it. After that I reinstalled the PPA packages to test them, and it still worked. I'll be on the lookout and will do the snmpwalk if this recurs, but I'm not that optimistic, since this is the first time it has happened in years.

Are you aware of UPS shutdown not working with this NMC (AP9617) over SNMP? I think I had a problem with killpower not being sent to the UPS, so it never shut down, which is why I was looking for an upgraded version in the PPAs. I'm still not sure whether it works; I'd need to do more testing.