networkupstools / nut

The Network UPS Tools repository. UPS management protocol Informational RFC 9271 published by IETF at https://www.rfc-editor.org/info/rfc9271 Please star NUT on GitHub, this helps with sponsorships!
https://networkupstools.org/

Warning: excessive poll failures adding APC NMC card #2582

Open Karolusin opened 3 months ago

Karolusin commented 3 months ago

Hello, I had a NUT instance running for around 3 months when I decided to run apt-get upgrade in the Proxmox container where NUT was installed. Suddenly it stopped detecting my APC UPSes (I only have APC units). I tried reinstalling the whole container from scratch, but the same error appears, as shown below. [image]

I am using the snmp-ups driver to communicate with the device, as shown below. [image]

snmpwalk works without any issues, so I have ruled out network problems. [image]

I believe some update in the last 3 months must have broken the integration. I found that a similar issue was fixed 5 years ago: https://github.com/networkupstools/nut/issues/743

jimklimov commented 3 months ago

Hello, sorry to hear about the inconveniences.

One thing that stands out in your log screenshot is that you run upsdrvctl start and it reports "Terminating other driver". This is symptomatic of the nut-driver-enumerator (new in 2.8.0, which your updated container seems to ship) managing nut-driver@.service instances, one systemd unit per driver configuration. See more details at https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE)

The short of it is that current (2.8.x) NUT setups on modern Linux normally do not need manual upsdrvctl start-ups. Instead, you manage the systemd units (watch their journal logs, etc.) or use the provided upsdrvsvcctl script, which offers a similar experience but is aware of the service management framework. A manually started driver kills the systemd-wrapped one, then systemd revives its managed service, which in turn kills the one you started manually. Some reported rough edges of this approach were polished in later releases, and a few more fixes are queued for the upcoming 2.8.3.
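To make the "manage systemd units instead of upsdrvctl" advice concrete, here is a minimal sketch that only assembles the commands one would typically run. The device name `apc-ups` is hypothetical, and the sketch assumes the unit instance name equals the ups.conf section name, which the enumerator may not guarantee for every name; `systemctl list-units 'nut-driver@*'` shows the real instances.

```python
import shlex

def driver_unit(device):
    """Return the systemd unit assumed to wrap the driver for `device`.

    Assumption: the instance name is the ups.conf section name as-is.
    """
    return f"nut-driver@{device}.service"

def service_commands(device):
    """Build argv lists for inspecting and restarting the managed driver."""
    unit = driver_unit(device)
    return [
        ["systemctl", "status", unit],                          # current state
        ["journalctl", "-u", unit, "-n", "50", "--no-pager"],   # recent driver log
        ["systemctl", "restart", unit],                         # restart via systemd
    ]

# "apc-ups" is a placeholder; substitute the section name from ups.conf.
for argv in service_commands("apc-ups"):
    print(shlex.join(argv))
```

The commands are printed rather than executed, so they can be reviewed and run by hand (usually as root) on the host where NUT is installed.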

Per https://github.com/networkupstools/nut/issues/2308#issuecomment-1948224043 the poll failure message may be...

a false positive log, due to indeed non-existent OIDs. Can you please confirm with an snmpget?
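One possible way to do that check is sketched below: build an snmpget command for a single OID and decide from its output whether the agent actually serves it. The host, community string, SNMP version, and OID are placeholders, not values from this report; the OIDs worth checking are the ones the driver complains about in its debug log. The marker strings are the ones net-snmp tools usually print for missing OIDs.

```python
import shlex

# Text net-snmp typically prints when the agent has no value for an OID
# (SNMPv2c "No Such Object/Instance", SNMPv1 "noSuchName" error).
MISSING_MARKERS = ("No Such Object", "No Such Instance", "noSuchName")

def snmpget_command(host, oid, community="public", version="2c"):
    """Build the snmpget argv; all argument values here are assumptions."""
    return ["snmpget", "-v", version, "-c", community, host, oid]

def oid_is_missing(snmpget_output):
    """Return True if the snmpget output indicates a non-existent OID."""
    return any(marker in snmpget_output for marker in MISSING_MARKERS)

# Placeholder host (documentation address range) and placeholder OID.
print(shlex.join(snmpget_command("192.0.2.10", "1.3.6.1.4.1.318.1.1.1.1.1.1.0")))
```

If `oid_is_missing` returns True for the OIDs named in the poll failure messages, that would be consistent with the false-positive explanation quoted above.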