Closed rdukes6514 closed 2 months ago
Hello, sorry to hear about the issue.
Is it possible for you to try running directly the NUT v2.7.4 and newer driver programs with elevated debug verbosity, to see at what point the driver decides it can or can not connect to the device? See https://github.com/networkupstools/nut/wiki/Changing-NUT-daemon-debug-verbosity for details.
Also I wonder if your drivers actually got started as automatically created service instances, hence the attempts with upsdrvctl
fail (device is busy) - see https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE)
Hi Jim, and thanks.
I am currently running a Pi Zero 2 with RaspiOS Bookworm installed. The available NUT claims to be 2.8.0.
Thanks to your response , I finally figured out where those -D's go...
It looks like Debian automatically creates the services as noted by the following section of the install:
... Setting up nut-server (2.8.0-7) ... Created symlink /etc/systemd/system/nut.target.wants/nut-driver-enumerator.path → /lib/systemd/system/nut-driver-enum erator.path. Created symlink /etc/systemd/system/nut.target.wants/nut-driver-enumerator.service → /lib/systemd/system/nut-driver-e numerator.service. Created symlink /etc/systemd/system/nut.target.wants/nut-driver.target → /lib/systemd/system/nut-driver.target. Created symlink /etc/systemd/system/nut.target.wants/nut-server.service → /lib/systemd/system/nut-server.service. ...
And the fact they turn up in /etc/systemd/system...
... root@newups-l:~# find / -name nut-server* /etc/systemd/system/network-online.target.wants/nut-server.service /etc/systemd/system/nut.target.wants/nut-server.service /etc/systemd/system/multi-user.target.wants/nut-server.service /etc/init.d/nut-server /etc/nut/nut-server.service.apc /usr/lib/systemd/system/nut-server.service /usr/lib/systemd/system/nut-server.service.orig /usr/lib/tmpfiles.d/nut-server.conf /usr/share/doc/nut-server /usr/share/lintian/overrides/nut-server /sys/fs/cgroup/system.slice/nut-server.service /var/lib/systemd/deb-systemd-helper-enabled/nut-server.service.dsh-also /var/lib/systemd/deb-systemd-helper-enabled/nut.target.wants/nut-server.service /var/lib/dpkg/info/nut-server.prerm /var/lib/dpkg/info/nut-server.conffiles /var/lib/dpkg/info/nut-server.preinst /var/lib/dpkg/info/nut-server.list /var/lib/dpkg/info/nut-server.postinst /var/lib/dpkg/info/nut-server.postrm /var/lib/dpkg/info/nut-server.md5sums ...
And here is the Debug info:
... root@newups-l:~# systemctl daemon-reload root@newups-l:~# systemctl restart nut-server root@newups-l:~# systemctl status nut-server ● nut-server.service - Network UPS Tools - power devices information server Loaded: loaded (/lib/systemd/system/nut-server.service; enabled; preset: enabled) Active: active (running) since Thu 2024-04-04 11:40:35 CDT; 7s ago Main PID: 2912 (upsd) Tasks: 1 (limit: 380) CPU: 21ms CGroup: /system.slice/nut-server.service └─2912 /lib/nut/upsd -DDDDDD
Apr 04 11:40:42 newups-l upsd[2912]: 7.251318 [D3] Got PONG from UPS [heartbeat] Apr 04 11:40:42 newups-l upsd[2912]: 7.251384 [D3] sstate_dead: driver for UPS [UPS] says data is stale Apr 04 11:40:42 newups-l upsd[2912]: 7.251439 [D2] mainloop: polling 3 filedescriptors Apr 04 11:40:42 newups-l upsd[2912]: [D2] mainloop: no data available Apr 04 11:40:42 newups-l upsd[2912]: [D3] sstate_dead: driver for UPS [UPS] says data is stale Apr 04 11:40:42 newups-l upsd[2912]: [D3] Pinging UPS [heartbeat] Apr 04 11:40:42 newups-l upsd[2912]: [D2] mainloop: polling 3 filedescriptors Apr 04 11:40:42 newups-l upsd[2912]: [D3] Got PONG from UPS [heartbeat] Apr 04 11:40:42 newups-l upsd[2912]: [D3] sstate_dead: driver for UPS [UPS] says data is stale Apr 04 11:40:42 newups-l upsd[2912]: [D2] mainloop: polling 3 filedescriptors ...
I'm attempting to build from git, but I'm having to learn as I go...
Thanks again, Randall Dukes
Same story with a fresh install of Bookworm/Nut 2.8.0, followed by building and installing NUT 2.8.2 over it.
Is there a unit for nut-driver@UPS.service
? What does its log say about the device communications - why it goes stale?
I strongly suspect there's some weird fallout of warnings fixes (there were already a few places that de-facto relied on broken code) but so far lack a smoking gun...
FS objects:
root@newups-l:/# find / -name nut-driver@UPS.service
/etc/systemd/system/nut-driver.target.wants/nut-driver@UPS.service
/sys/fs/cgroup/system.slice/system-nut\x2ddriver.slice/nut-driver@UPS.service
Uncommented parts of /etc/systemd/system/nut-driver.target.wants/nut-driver@UPS.service
:
...
[Unit]
Description=Network UPS Tools - device driver for %I
After=local-fs.target
PartOf=nut-driver.target
Environment=NUT_IGNORE_NOWAIT=true
EnvironmentFile=-/etc/nut/nut.conf
SyslogIdentifier=%N
ExecStartPre=-/usr/bin/systemd-tmpfiles --create /usr/lib/tmpfiles.d/nut-common-tmpfiles.conf
ExecStart=/bin/sh -c 'NUTDEV="`/usr/lib/nut/nut-driver-enumerator.sh --get-device-for-service %i`" && [ -n "$NUTDEV" ] || { echo "FATAL: Could not find a NUT device section for service unit %i" >&2 ; exit 1 ; } ; exec /usr/sbin/upsdrvctl start "$NUTDEV"'
ExecReload=/bin/kill -USR1 $MAINPID
ExecStop=/bin/sh -c 'NUTDEV="`/usr/lib/nut/nut-driver-enumerator.sh --get-device-for-service %i`" && [ -n "$NUTDEV" ] || { echo "FATAL: Could not find a NUT device section for service unit %i" >&2 ; exit 1 ; } ; /usr/sbin/upsdrvctl stop "$NUTDEV"'
StartLimitInterval=0
Restart=always
RestartSec=15s
Type=forking
[Install]
WantedBy=nut-driver.target
...
/sys/fs/cgroup/system.slice/system-nut\x2ddriver.slice/nut-driver@UPS.service
----- can't be found
Unit status/log:
root@newups-l:/# systemctl status nut-driver@UPS.service
● nut-driver@UPS.service - Network UPS Tools - device driver for UPS
Loaded: loaded (/lib/systemd/system/nut-driver@.service; enabled; preset: enabled)
Drop-In: /etc/systemd/system/nut-driver@UPS.service.d
└─nut-driver-enumerator-generated-checksum.conf, nut-driver-enumerator-generated-devicename.conf
Active: active (running) since Mon 2024-04-08 09:09:06 CDT; 21h ago
Main PID: 886 (upscode2)
Tasks: 1 (limit: 380)
CPU: 17.690s
CGroup: /system.slice/system-nut\x2ddriver.slice/nut-driver@UPS.service
└─886 /lib/nut/upscode2 -a UPS
Apr 09 06:54:19 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:19 newups-l upscode2[886]: No contact with UPS, delaying init. Apr 09 06:54:21 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:21 newups-l upscode2[886]: No contact with UPS, delaying init. Apr 09 06:54:23 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:23 newups-l upscode2[886]: No contact with UPS, delaying init. Apr 09 06:54:25 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:25 newups-l upscode2[886]: No contact with UPS, delaying init. Apr 09 06:54:27 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:27 newups-l upscode2[886]: No contact with UPS, delaying init. Apr 09 06:54:29 newups-l upscode2[886]: Missing UPCL after UPCL Apr 09 06:54:29 newups-l upscode2[886]: No contact with UPS, delaying init.
Thanks.
Thanks, I've updated formatting in your post for "text screenshots" to be more readable.
The /etc/systemd/system/nut-driver.target.wants/nut-driver@UPS.service
seems like what NDE (nut-driver-enumerator) generates from the ups.conf
sections, so https://github.com/networkupstools/nut/wiki/nut%E2%80%90driver%E2%80%90enumerator-(NDE) wiki page's notes about its potential conflict with a manually-executed driver program or upsdrvctl
do apply. In short - if you want to run the driver manually, e.g. to experiment with debug or custom builds where you add more places for it to print debug info, be sure to temporarily stop the systemd unit. If you just want more debugging with the existing program (and in the context it has as a service), temporarily setting debug_min = 6
in its ups.conf
section can help: https://github.com/networkupstools/nut/wiki/Changing-NUT-daemon-debug-verbosity
The /sys/fs/cgroup/...
is an internal detail of how systemd manages and constrains the services/units, so not much of practical interest. Still, the "can't be found" part may be due to the backslash in \x
(should be two separate characters in the path name AFAIK) -- did you single/double-quote the pathname when you looked for it? Try escaping that slash as \\x
(otherwise it is interpreted as an escaped "x" character sequence).
Finally the service log is the interesting part and seeing how the driver lifetime started (more so with boosted debug verbosity) would be of interest for troubleshooting. So far I see just the initialization faults, retried every 2 seconds:
Apr 09 06:54:19 newups-l upscode2[886]: Missing UPCL after UPCL
Apr 09 06:54:19 newups-l upscode2[886]: No contact with UPS, delaying init.
You can look at existing unit journal via journalctl -lu nut-driver@UPS.service
or watch it scroll when you restart the service with e.g. journalctl -flu nut-driver@UPS.service & systemctl restart nut-driver@UPS.service
(both as root
).
I wonder if there is some permissions issue for the serial port devfs node (e.g. owned by nut
vs. nobody
or root
or dialout
etc. - more so if you rebuilt NUT without specifying same user/group/... settings as applied by packaging earlier), and/or if it has a different pathname after the OS upgrade? As a quick hack for permissions, you can try user=root
in the ups.conf
section for the device, so the driver does not drop privileges (potentially insecure, so if this is the problem -- better find a way to store/re-apply the needed ownership in devfs across reboots).
Jim,
Not sure how to do the formatting, but...
A new install of Bookworm on a Pi Zero 2w, was made and NUT (2.8.0/stable) was setup for an APC smart ups 620 using the apcsmart protocol, and I get a working system:
...
root@apc620:~# systemctl status nut-server
● nut-server.service - Network UPS Tools - power devices information server
Loaded: loaded (/lib/systemd/system/nut-server.service; enabled; preset: e>
Active: active (running) since Wed 2024-04-10 04:37:19 CDT; 1h 8min ago
Main PID: 1076 (upsd)
Tasks: 1 (limit: 380)
CPU: 401ms
CGroup: /system.slice/nut-server.service
└─1076 /lib/nut/upsd -F
Apr 10 04:37:20 apc620 upsd[1076]: Running as foreground process, not saving a >
Apr 10 04:37:20 apc620 nut-server[1076]: Running as foreground process, not sav>
Apr 10 04:37:20 apc620 nut-server[1076]: User upsmaster@127.0.0.1 logged into U>
Apr 10 04:37:20 apc620 upsd[1076]: User upsmaster@127.0.0.1 logged into UPS [UP>
Apr 10 04:37:20 apc620 nut-server[1076]: User upsmaster@127.0.0.1 logged into U>
Apr 10 04:37:20 apc620 upsd[1076]: User upsmaster@127.0.0.1 logged into UPS [he>
Apr 10 04:37:56 apc620 nut-server[1076]: Data for UPS [heartbeat] is stale - ch>
Apr 10 04:37:56 apc620 upsd[1076]: Data for UPS [heartbeat] is stale - check dr>
Apr 10 04:37:57 apc620 nut-server[1076]: UPS [heartbeat] data is no longer stale
Apr 10 04:37:57 apc620 upsd[1076]: UPS [heartbeat] data is no longer stale
...
If I take an identical image and setup for the upscode2 protocol (the only changes being in ups.conf for upscode2, with the added debug_min = 6) for a Powervar PRC1000-a/NS I get: Note results are about the same whether After=network-online.target and Requires=network.target have been uncommented or not (in /etc/systemd/system/nut.target.wants/nut-server.service).
...
● nut-server.service - Network UPS Tools - power devices information server
Loaded: loaded (/lib/systemd/system/nut-server.service; enabled; preset: enabled)
Active: active (running) since Wed 2024-04-10 05:38:37 CDT; 25s ago
Process: 1458 ExecReload=/sbin/upsd -c reload -P $MAINPID (code=exited, status=0/SUCCESS)
Main PID: 1447 (upsd)
Tasks: 1 (limit: 380)
CPU: 55ms
CGroup: /system.slice/nut-server.service
└─1447 /lib/nut/upsd -F -DDDDDD
Apr 10 05:39:00 powervar2 upsd[1447]: [D2] write: [destfd=6] [len=24] [VAR UPS ups.status "OL"]
Apr 10 05:39:00 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
Apr 10 05:39:00 powervar2 upsd[1447]: [D6] Entering check_command: GET
Apr 10 05:39:00 powervar2 upsd[1447]: [D6] check_command: Calling command handler for GET
Apr 10 05:39:00 powervar2 upsd[1447]: [D2] write: [destfd=8] [len=30] [VAR heartbeat ups.status "OB"]
Apr 10 05:39:00 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
Apr 10 05:39:02 powervar2 nut-server[1447]: 25.110534 [D3] Pinging UPS [heartbeat]
Apr 10 05:39:02 powervar2 nut-server[1447]: 25.110590 [D2] mainloop: polling 5 filedescriptors
Apr 10 05:39:02 powervar2 upsd[1447]: [D3] Pinging UPS [heartbeat]
Apr 10 05:39:02 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 nut-server[1447]: 103.711951 [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 nut-server[1447]: 104.021653 [D3] Pinging UPS [heartbeat]
Apr 10 05:40:21 powervar2 nut-server[1447]: 104.021713 [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 nut-server[1447]: 104.021760 [D3] Got PONG from UPS [heartbeat]
Apr 10 05:40:21 powervar2 nut-server[1447]: 104.021802 [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 upsd[1447]: [D3] Pinging UPS [heartbeat]
Apr 10 05:40:21 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
Apr 10 05:40:21 powervar2 upsd[1447]: [D3] Got PONG from UPS [heartbeat]
Apr 10 05:40:21 powervar2 upsd[1447]: [D2] mainloop: polling 5 filedescriptors
...
This kind of points right back to the upscode2 driver itself
Not sure how to do the formatting, but...
Surround by identical lines of 3 or 4 backticks.
As for the log, can you look at nut-driver@UPS.service
? The nut-server
is the upsd
data server which relays info from drivers to networked protocol.
Jim, Sorry, I had to step away.
After a couple of reboots the upscode2 driver appears to be working "correctly" with NUT 2.8.0. These are the changes made so far:
/usr/lib/systemd/system/nut-server.service a. Uncomment After=network-online.target and Requires=network.target. These appear to be uncommented by the enumerator for apcsmart, but not for upscode2. I think this was required for NUT 2.7.4 as well. b. Change "ExecStart=/sbin/upsd -F" to "ExecStart=/sbin/upsd -F -DDDDDD"
Add "debug_min = 6" to under upscode driver section of /etc/nut/ups.conf
Reboot. systemctl stop nut-server / systemctl daemon-reload / systemctl restart nut-server didn't appear to do the trick.
I have one more ups to update to to setup, so I'll start at square one to check the whole install process.
Thanks, Randall
Edit: Part of the problem was the rs232 level shifter I was using was checking in and out. replaced with a new one and both the Powervar UPSs are working. Both changes to nut-server.service and the change to ups.conf appear to be required to get NUT 2.8.0 upscode2 communicating.
Jim,
I think you mark this as solved.
Thanks for the updates. Note that normally you should remove (comment away) the debug_min=6
setting from config files when all began running okay, otherwise your syslog and associated storage would be needlessly stressed.
I have several older UPSes that communicate serially by either the apcsmart or upscode2 protocol. Each UPS is monitored by a Pi Zero or a PC, connected by rs232. All instances of NUT were fine under Debian or Raspbian Buster and Bullseye using NUT 2.7.4.
The one apcsmart UPS/PC, that has been updated to Bookworm/NUT 2.8.0, is working fine. However the upscode2 UPSes are not communicating. Both the webpage for upsstats.cgi and systemctl status nut-server report "Data for UPS [UPS] is stale - check driver". Restarting with upsdrvctl returns "Missing UPCL after UPCL".
This sound a lot like the PR #1455/#1495 regression issue. I cannot get this to run with anywhere from 1-6 -D's in ups.conf.
I have also tried to wget all the NUT files from Debian Trixie (nut 2.8.1-3 and install with dpkg, but I get the same results - Stale Data and the UPCL errors.
Help?
Thanks Randall Dukes