networkupstools / nut

The Network UPS Tools repository. UPS management protocol Informational RFC 9271 published by IETF at https://www.rfc-editor.org/info/rfc9271 Please star NUT on GitHub, this helps with sponsorships!
https://networkupstools.org/

upsmon child process PID stored in upsmon.pid #123

Open bigon opened 10 years ago

bigon commented 10 years ago

Hello,

When using systemd, it complains about the PID stored in the .pid file:

nut-monitor.service: Supervising process XXXX which is not our child. We'll most likely not notice when it exits.

And indeed when looking in upsmon.pid, the PID stored there is the one from the grand-child (unprivileged process) of the process started by init. Shouldn't this be the PID of the direct forked process instead?

aquette commented 10 years ago

2014-04-22 22:39 GMT+02:00 Laurent Bigonville notifications@github.com:

> Hello,

Hi Laurent

> When using systemd, it complains about the PID stored in the .pid file:
>
> nut-monitor.service: Supervising process XXXX which is not our child. We'll most likely not notice when it exits.
>
> And indeed when looking in upsmon.pid, the PID stored there is the one from the grand-child of the process started by init. Shouldn't this be the PID of the direct forked process instead?

look closer and read this FAQ entry: http://www.networkupstools.org/docs/FAQ.html#_why_are_there_two_copies_of_upsmon_running

that said, is there an "override" mechanism in systemd to avoid this unnecessary msg?

cheers, Arnaud

Engineering Linux/Unix Expert - Opensource Solutions Lead - Eaton - http://opensource.eaton.com NUT (Network UPS Tools) Project Leader - http://www.networkupstools.org Debian Developer - http://www.debian.org

Free Software Developer - http://arnaud.quette.fr

Conseiller Municipal - Saint Bernard du Touvet

bigon commented 10 years ago

I'm not sure there is an override.

Is it really a problem to store the pid of the process running as root instead of the unprivileged one?

bigon commented 10 years ago

Apparently the unpriv process complains and continues to run if the privileged process is killed

avr 23 20:31:05 fornost upsmon[9846]: upsmon parent process died - shutdown impossible
avr 23 20:31:05 fornost upsmon[9846]: Parent died - shutdown impossible

clepple commented 10 years ago

Stepping back, what is systemd trying to accomplish by watching the PID? If the intent is to restart upsmon if it is killed, then the right thing might be to use the -D flag to keep the parent process from going into the background. Then systemd can monitor it directly, and the PID file is still available to use for sending SIGHUP to the child to reread the configuration file (per the limitations in the upsmon man page).

bigon commented 10 years ago

Oh indeed, we could prevent it from going into the background; this is even advised by the systemd developers.

As for reloading, we would then probably need to add ExecReload= to the systemd service too.

iva2k commented 3 years ago

On the surface, it sounds like there should be two different .pid files: one for upsmon run as root and one for upsmon run as the nut/unprivileged user. If that's the case, it is a design bug. Should systemd take care of the PID of upsmon run as root, while the forked unprivileged PID of upsmon run as nut is taken care of by the forking code?

Is the unprivileged PID overwriting the root PID, and does that break systemd and/or upsmon operation?

Even if that's not the case, the error messages in syslog do not look comforting and give no confidence that UPS shutdown will work correctly in all circumstances. This sounds like a critical issue. Can someone from the NUT team chime in and triage this?

Here's a capture of sudo service nut-client status and ps from Ubuntu 18LTS, latest NUT package:

SERVICES STATUS : CLIENT =======================================================
● nut-monitor.service - Network UPS Tools - power device monitor and shutdown controller
   Loaded: loaded (/lib/systemd/system/nut-monitor.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2021-03-01 09:41:38 PST; 7s ago
  Process: 18311 ExecStart=/sbin/upsmon (code=exited, status=0/SUCCESS)
 Main PID: 18313 (upsmon)
    Tasks: 2 (limit: 4915)
   CGroup: /system.slice/nut-monitor.service
           ├─18312 /lib/nut/upsmon
           └─18313 /lib/nut/upsmon

Mar 01 09:41:38 fs1.a.com systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Mar 01 09:41:38 fs1.a.com upsmon[18311]: fopen /var/run/nut/upsmon.pid: No such file or directory
Mar 01 09:41:38 fs1.a.com upsmon[18311]: UPS: ups2@localhost (master) (power value 1)
Mar 01 09:41:38 fs1.a.com upsmon[18311]: Using power down flag file /etc/killpower
Mar 01 09:41:38 fs1.a.com upsmon[18312]: Startup successful
Mar 01 09:41:38 fs1.a.com systemd[1]: nut-monitor.service: Can't open PID file /var/run/nut/upsmon.pid (yet?) after start: No such file or direc
Mar 01 09:41:38 fs1.a.com systemd[1]: nut-monitor.service: Supervising process 18313 which is not our child. We'll most likely not notice when i
Mar 01 09:41:38 fs1.a.com systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.

RUNNING PROCESSES ==============================================================
USER       PID %CPU %MEM    VSZ   RSS TT       STAT  STARTED     TIME COMMAND         GROUP      GID
nut      18279  0.0  0.0  21996   456 ?        Ss   09:41:36 00:00:00 usbhid-ups      nut        130
nut      18281  0.0  0.0  38136   364 ?        Ss   09:41:36 00:00:00 upsd            nut        130
root     18312  0.0  0.0  35836  2764 ?        Ss   09:41:37 00:00:00 upsmon          root         0
nut      18313  0.0  0.0  50000  3936 ?        S    09:41:37 00:00:00 upsmon          nut        130

PID FILES ======================================================================
total 16
-rw-r--r-- 1 nut  nut  6 Mar  1 09:41 upsd.pid
-rw-r--r-- 1 root root 6 Mar  1 09:41 upsmon.pid
srw-rw---- 1 nut  nut  0 Mar  1 09:41 usbhid-ups-ups2
-rw-r--r-- 1 nut  nut  6 Mar  1 09:41 usbhid-ups-ups2.pid

gwaitsi commented 3 years ago

The fopen messages below also appear on FreeBSD variants, i.e. FreeNAS/TrueNAS (although NUT appears to work and shut down, and all related PID files are created):

fopen /var/run/nut/upsmon.pid: No such file or directory
fopen /var/db/nut/upsd.pid No such file or directory

electrofloat commented 3 years ago

So... what is the solution to this issue?

RJHsiao commented 3 years ago

Hi there, I get the same message on my Ubuntu 20.04 LTS server and have found no solution. Is somebody working on it? Or do solutions exist that we could find by googling a keyword I don't know?

jimklimov commented 3 years ago

I keep meaning to take a closer look at this, but it keeps drowning in priorities :(

jimklimov commented 2 years ago

PR #683 (and #349 before it, now part of it) introduces a separation of debugging options vs. foreground/background running behavior, and in particular redefines the daemons under systemd units to run in foreground. Hopefully that change would alleviate this issue. Testing is welcome ;)

jimklimov commented 2 years ago

Playing around with the daemons and service units for the issues/PRs linked above, I found an interesting behavior here:

When nut-monitor.service is initially started (newly as a foregrounded process without the extra forking for systemd, but still with forking to privileged/unprivileged pair), the "Main PID" for systemd is that of the (root-owned) parent process:

* nut-monitor.service - Network UPS Tools - power device monitor and shutdown controller
   Loaded: loaded (/lib/systemd/system/nut-monitor.service; disabled)
   Active: active (running) since Wed 2022-02-16 14:23:56 UTC; 3s ago
 Main PID: 24963 (upsmon)
   CGroup: /system.slice/nut-monitor.service
           ├─24963 /usr/local/ups/sbin/upsmon -F
           └─24964 /usr/local/ups/sbin/upsmon -F

Feb 16 14:23:56 mirabox systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Feb 16 14:23:56 mirabox systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Feb 16 14:23:56 mirabox nut-monitor[24963]: fopen /var/run/upsmon.pid: No such file or directory
Feb 16 14:23:56 mirabox nut-monitor[24963]: Could not find PID file to see if previous upsmon instance is already running!
Feb 16 14:23:56 mirabox nut-monitor[24963]: UPS: nutdev1 (primary) (power value 1)
Feb 16 14:23:56 mirabox nut-monitor[24963]: Using power down flag file /etc/killpower

This only partially matches the other info: while "24963" is indeed the root parent, the recorded PIDFile value is that of the child:

# ps -ef | grep  upsmon
root     24963     1  0 14:23 ?        00:00:00 /usr/local/ups/sbin/upsmon -F
nobody   24964 24963  0 14:23 ?        00:00:00 /usr/local/ups/sbin/upsmon -F

# cat /run/upsmon.pid
24964

Systemd notices that after e.g. reloading the service unit:

# journalctl -flu nut-monitor &
# systemctl reload  nut-monitor
Feb 16 14:28:02 mirabox systemd[1]: Reloading Network UPS Tools - power device monitor and shutdown controller.
Feb 16 14:28:03 mirabox nut-monitor[24963]: Reloading configuration
Feb 16 14:28:03 mirabox nut-monitor[24974]: Network UPS Tools upsmon 2.7.4-4685-gc025b7e
root@mirabox:/home/bios/nut# Feb 16 14:28:03 mirabox systemd[1]: nut-monitor.service: Supervising process 24964 which is not our child. We'll most likely not notice when it exits.
Feb 16 14:28:03 mirabox systemd[1]: Reloaded Network UPS Tools - power device monitor and shutdown controller.

and then the reported "Main PID" matches it instead:

* nut-monitor.service - Network UPS Tools - power device monitor and shutdown controller
   Loaded: loaded (/lib/systemd/system/nut-monitor.service; disabled)
   Active: active (running) since Wed 2022-02-16 14:23:56 UTC; 4min 49s ago
  Process: 24974 ExecReload=/usr/local/ups/sbin/upsmon -c reload (code=exited, status=0/SUCCESS)
 Main PID: 24964 (upsmon)
   CGroup: /system.slice/nut-monitor.service
           ├─24963 /usr/local/ups/sbin/upsmon -F
           └─24964 /usr/local/ups/sbin/upsmon -F

Feb 16 14:23:56 mirabox systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Feb 16 14:23:56 mirabox nut-monitor[24963]: fopen /var/run/upsmon.pid: No such file or directory
Feb 16 14:23:56 mirabox nut-monitor[24963]: Could not find PID file to see if previous upsmon instance is already running!
Feb 16 14:23:56 mirabox nut-monitor[24963]: UPS: nutdev1 (primary) (power value 1)
Feb 16 14:23:56 mirabox nut-monitor[24963]: Using power down flag file /etc/killpower
Feb 16 14:28:02 mirabox systemd[1]: Reloading Network UPS Tools - power device monitor and shutdown controller.
Feb 16 14:28:03 mirabox nut-monitor[24963]: Reloading configuration
Feb 16 14:28:03 mirabox nut-monitor[24974]: Network UPS Tools upsmon 2.7.4-4685-gc025b7e
Feb 16 14:28:03 mirabox systemd[1]: nut-monitor.service: Supervising process 24964 which is not our child. We'll most likely not notice when it exits.
Feb 16 14:28:03 mirabox systemd[1]: Reloaded Network UPS Tools - power device monitor and shutdown controller.
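
To check for the same mismatch on another machine, a minimal shell sketch could compare the two values directly. The helper name pidfile_matches is made up for illustration; the unit name and PID file path in the usage comment are the ones from the output above and may differ per distro or build.

```shell
#!/bin/sh
# pidfile_matches is a hypothetical helper, not part of NUT or systemd.
# $1: PID that systemd reports as Main PID, $2: path to the PID file.
pidfile_matches() {
    main_pid=$1
    pidfile=$2
    if [ ! -r "$pidfile" ]; then
        echo "cannot read $pidfile" >&2
        return 2
    fi
    # Only the first line of the PID file is relevant.
    recorded=""
    IFS= read -r recorded < "$pidfile" || true
    if [ "$recorded" = "$main_pid" ]; then
        echo "match: systemd and $pidfile both point at PID $main_pid"
    else
        echo "mismatch: systemd tracks $main_pid, $pidfile records $recorded"
        return 1
    fi
}

# Usage on a systemd host:
#   main=$(systemctl show -p MainPID nut-monitor | cut -d= -f2)
#   pidfile_matches "$main" /run/upsmon.pid
```

If the two values differ after a reload, as they do in the output above, the service is in the state that the systemd warning describes.
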
jimklimov commented 2 years ago

One more aspect discussed above, about inability to open PID files like this:

fopen /var/run/upsmon.pid: No such file or directory
fopen /var/state/nut/upsd.pid No such file or directory

Per the investigation (and fixes) done during work on PR #1300, these are probably benign: these two daemons check whether an earlier copy is already running by looking at a PID file (if it exists) and signaling the PID number recorded there. On the first start after a reboot (or a clean restart of a service), these files do not exist, and that fact is reported. With #1300, the reasons why such probing failed (no PID file, unparsable PID file, some error signalling a process) should now be logged in a less confusing manner, e.g. as seen above:

Feb 16 14:23:56 mirabox nut-monitor[24963]: fopen /var/run/upsmon.pid: No such file or directory
Feb 16 14:23:56 mirabox nut-monitor[24963]: Could not find PID file to see if previous upsmon instance is already running!
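
For readers who want the probing logic spelled out, here is a minimal shell sketch of the three failure cases listed above. It is not NUT's actual C code; the helper name probe_pidfile and the result strings are made up for illustration.

```shell
#!/bin/sh
# probe_pidfile is a hypothetical helper that mimics the startup check:
# prints one of: missing, unparsable, not-signalable, running.
probe_pidfile() {
    pidfile=$1
    # Case 1: no PID file, e.g. on the first start after a reboot.
    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi
    pid=""
    IFS= read -r pid < "$pidfile" || true
    # Case 2: the content is not a positive integer.
    case $pid in
        ''|*[!0-9]*) echo "unparsable"; return 0 ;;
    esac
    if [ "$pid" -le 0 ]; then
        echo "unparsable"
        return 0
    fi
    # Case 3: signal 0 delivers nothing, it only tests whether signalling works.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "not-signalable"
    fi
}

# Usage: probe_pidfile /run/nut/upsmon.pid
```

A result of missing right after boot corresponds to the benign fopen message discussed here.
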
jimklimov commented 2 years ago

On a related note, actual drivers wrapped into systemd/SMF unit instances (with nut-driver-enumerator now part of NUT) could also benefit from not forking when started via upsdrvctl. According to comments in the latter, it generally uses forkexec() because it may start many drivers in parallel. Someone might explore adding an option to forgo that fork when starting a single device, but this has not been addressed so far AFAIK.

Grandma-Betty commented 2 years ago

Came here looking for a solution for this issue. Will it get fixed in the next nut release? Is there a workaround? Here's what I get:

Feb 21 21:24:23 ubuntuserver systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Feb 21 21:24:23 ubuntuserver upsmon[6887]: fopen /run/nut/upsmon.pid: No such file or directory
Feb 21 21:24:23 ubuntuserver upsmon[6887]: Using power down flag file /etc/killpower
Feb 21 21:24:23 ubuntuserver upsmon[6887]: UPS: ups@192.168.30.5 (slave) (power value 1)
Feb 21 21:24:23 ubuntuserver upsmon[6896]: Startup successful
Feb 21 21:24:23 ubuntuserver systemd[1]: nut-monitor.service: Can't open PID file /run/nut/upsmon.pid (yet?) after start: Operation not permitted
Feb 21 21:24:23 ubuntuserver systemd[1]: nut-monitor.service: Supervising process 6898 which is not our child. We'll most likely not notice when it exits.
Feb 21 21:24:23 ubuntuserver systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.

dan commented 2 years ago

I have the same issue.

jimklimov commented 2 years ago

Not sure - there are some more pressing priorities at the moment, at least on my side.

Thinking of last week's investigation, however, I wonder if the systemd unit "PIDFile=..." is needed here. Without it, I suppose systemd would just track the parent (root) process. Anyhow, it cannot do much about the unprivileged child going AWOL, except restart the parent to get them both alive again. Thinking about it more, maybe that was why PIDFile got there in the first place (to detect the untimely demise of the child).

brianbloom commented 2 years ago

I think this is tripping up my shutdown scripts as well, since I get the same log messages about PID problems. Or maybe I don't understand the shutdown workflow well enough. I have some bandwidth to help with testing if someone can advise what I should do.

jimklimov commented 2 years ago

Can you please check if service definitions in current NUT handle this better? At least, daemons should now run in foreground mode so one fork less.

brianbloom commented 2 years ago

@jimklimov (assuming that is addressed to me) I am running an apt-installed version of 2.7.4. Does "current NUT" mean one of the 2.8.0 releases?

ioogithub commented 2 years ago

Regarding the systemd "Can't open PID file" and "Supervising process ... which is not our child" messages in @Grandma-Betty's comment above:

I am a new user and really struggling to get my shutdown script working; everything seems like it should work, but it simply does not. This is the only error I can see. Can a dev or experienced user comment on whether this could be causing a problem with shutdown sequences, or is this issue unrelated?

I am following this tutorial: https://forums.unraid.net/topic/93341-tutorial-networked-nut-for-cyberpower-ups/ and everything works up to the upssched.conf part.

jimklimov commented 1 year ago

@hawtkey: Depending on the daemon, PID files are generally used in NUT to verify whether another copy is running, to send signals to it via the command line (e.g. commands to reload, FSD, etc.), or to kill off an older sibling before starting a new one. Systemd is a relatively new kid on the block and not ubiquitous across OSes, so some tradeoffs still have to be designed.

recklessnl commented 1 year ago

Having the exact same issue as well.

Is there no workaround in the meantime?

jimklimov commented 1 year ago

Run daemon foreground?

Grandma-Betty commented 1 year ago

@jimklimov Could you link an example of how to do that? I'm curious why the official Ubuntu repositories are taking so long to move on; they're still on nut package 2.7.4. Maybe an update would fix some of the issues we are all having with ESXi hosts, which could be related to the outdated libusb libraries.

recklessnl commented 1 year ago

I upgraded to the latest version of NUT today on my Debian 11 system and I'm still having the same issue. @jimklimov what's the easiest way to run the daemon in the foreground?

Sep 06 17:18:39 proxmox systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Sep 06 17:18:39 proxmox upsmon[9410]: fopen /run/nut/upsmon.pid: No such file or directory
Sep 06 17:18:39 proxmox upsmon[9410]: UPS: ups1@localhost (master) (power value 1)
Sep 06 17:18:39 proxmox upsmon[9410]: Using power down flag file /etc/killpower
Sep 06 17:18:39 proxmox systemd[1]: nut-monitor.service: Can't open PID file /run/nut/upsmon.pid (yet?) after start: Operation not permitted
Sep 06 17:18:39 proxmox upsmon[9411]: Startup successful
Sep 06 17:18:39 proxmox systemd[1]: nut-monitor.service: Supervising process 9413 which is not our child. We'll most likely not notice when it exits.
Sep 06 17:18:39 proxmox systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Sep 06 17:18:39 proxmox upsmon[9413]: Init SSL without certificate database

jimklimov commented 1 year ago

Can't really speak for distributions' cadence - that's outside the scope of NUT as an upstream project. From what I gather, @bigon worked on proposing an updated package recipe for "experimental" distro; and from there it would eventually trickle by backports into stable/LTS distros if nobody complains of regressions.

Actually, there have been a few issue fixes since the NUT v2.8.0 release, and some issues are still outstanding (e.g. certain, but not all, CPS-like devices that talk rubbish on the USB HID protocol were understood before, but are not now that we check it more strictly). So maybe it would be an eventual NUT v2.8.1+ that hits the stable distros.

jimklimov commented 1 year ago

As for running the daemon differently: it depends.

Assuming that you still have NUT v2.7.4 wrapped by systemd, you can either hackily change the unit definition in place (systemctl status nut-monitor should show the path to the file to edit; such edits would be lost when the package is upgraded), or add a "drop-in" file, which the systemd daemon would merge in memory over the packaged definition. Either way, change the unit type from "forking" to the default ("simple") and add the foreground command-line option (such as -D, suggested earlier in this thread) to the ExecStart=.../upsmon line.

Unit definitions in NUT v2.8.0+ sources should actually include this. Then it is up to the distro what unit definitions they package - from NUT or inherited from their own older package recipe revisions.
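
Since an example was requested, here is a rough shell sketch of the drop-in approach for a v2.7.4 package. Everything specific in it is an assumption to adapt: the helper name write_upsmon_dropin and the file name foreground.conf are made up, the upsmon path must match what your packaged unit uses, and -D is the option suggested earlier in this thread (it also enables debug output).

```shell
#!/bin/sh
# write_upsmon_dropin is a hypothetical helper, not shipped with NUT.
# $1: drop-in directory, $2: path to upsmon, $3: foreground option (default -D).
write_upsmon_dropin() {
    dropin_dir=$1
    upsmon_bin=$2
    fg_opt=${3:--D}
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/foreground.conf" <<EOF
[Service]
Type=simple
# An empty ExecStart= clears the packaged command before a new one is set.
ExecStart=
ExecStart=$upsmon_bin $fg_opt
EOF
}

# Example, as root (paths are assumptions; check systemctl status nut-monitor):
#   write_upsmon_dropin /etc/systemd/system/nut-monitor.service.d /sbin/upsmon -D
#   systemctl daemon-reload && systemctl restart nut-monitor
```

The packaged PIDFile= line is deliberately left alone here, since whether it is still needed is an open question above.
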

jimklimov commented 1 year ago

Also, a shout-out to all who post "I have same issue": please, do detail which NUT version/build you have - this is an area where fixes are iterated, so no NUT is made equal ;)

And also, just to help me wrap my head around this: what "issue" do each of you have?

recklessnl commented 1 year ago

Thanks for the detailed response Jim! In my case, NUT with my UPS has worked for years without any hiccups, but for the last few weeks it's not been reliable anymore, with data going stale, and connections being refused. Here's my config, and keep in mind this used to work flawlessly for years with my Cyberpower UPS.

If I reboot the server, it works for like a day. All the attributes show up, and the pid errors discussed in this thread all show up right away, but it will work and communicate correctly with my UPS. But the next day, it starts failing with either connection failed or data stale, without fail.

I'm using NUT version 2.8.0-2, from the Debian Unstable repo. Currently running latest version of Proxmox, which is basically Debian 11.

ups.conf:

[ups1]
  driver = usbhid-ups
  port = auto
  desc = "Cyberpower UPS Server"
  pollinterval = 15

upsmon.conf:

POLLFREQ 8
DEADTIME 25

MONITOR ups1@localhost 1 upsmonitor my_password master
RUN_AS_USER nut
POWERDOWNFLAG /etc/killpower
SHUTDOWNCMD "/usr/sbin/shutdown -h now"
#NOTIFYCMD /etc/nut/notify
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
NOTIFYFLAG ONLINE SYSLOG+WALL+EXEC
NOTIFYFLAG COMMBAD SYSLOG+WALL+EXEC
NOTIFYFLAG COMMOK SYSLOG+WALL+EXEC
NOTIFYFLAG REPLBATT SYSLOG+WALL+EXEC
NOTIFYFLAG NOCOMM SYSLOG+EXEC
NOTIFYFLAG FSD SYSLOG+EXEC
NOTIFYFLAG SHUTDOWN SYSLOG+EXEC

After a while, upsmon service shows Data Stale:


Sep 08 16:15:30 proxmox upsmon[9413]: Poll UPS [ups1@localhost] failed - Data stale
Sep 08 16:15:38 proxmox upsmon[9413]: Poll UPS [ups1@localhost] failed - Data stale
Sep 08 16:15:46 proxmox upsmon[9413]: Poll UPS [ups1@localhost] failed - Data stale

service nut-driver status shows:

Sep 07 22:41:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: Cannot send after transport endpoint shutdown
Sep 07 22:41:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: No such device
Sep 07 22:42:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: Cannot send after transport endpoint shutdown
Sep 07 22:42:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: No such device
Sep 07 22:42:57 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: Cannot send after transport endpoint shutdown
Sep 07 22:42:57 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: No such device
Sep 07 22:44:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: Cannot send after transport endpoint shutdown
Sep 07 22:44:42 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: No such device
Sep 07 22:44:57 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: Cannot send after transport endpoint shutdown
Sep 07 22:44:57 proxmox usbhid-ups[9405]: libusb_get_report: error sending control message: No such device

@jimklimov , where should I put the debug_min and to what value should I set it to prevent this from happening? I'd love to get NUT working again.

jimklimov commented 1 year ago

Thanks for the details, though the particular issue here is likely not about upsmon pid.

libusb_get_report: error sending control message: No such device

This looks sinister... and there are many reports of CPS devices getting reconnected (dmesg may confirm), at which point, AFAIK, the kernel usually grabs the "newly discovered" device and, per udev rules, should relinquish access to the NUT user in the OS.
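
A quick way to look for such reconnects is to filter the kernel log. The sketch below is only that: usb_reconnects is a made-up helper name, and the patterns assume the usual Linux kernel wording for USB disconnect and enumeration messages, so they may need adjusting.

```shell
#!/bin/sh
# usb_reconnects is a hypothetical filter: it reads kernel log lines on stdin
# and keeps those that look like a USB device dropping off or reappearing.
usb_reconnects() {
    grep -E 'USB disconnect|new .*USB device number'
}

# Usage (dmesg may require root on some systems):
#   dmesg | usb_reconnects
```

Disconnect and re-enumeration lines around the time the driver logs "No such device" would support the reconnect theory.
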

Recent fixes included the ability of usbhid-ups to reconnect on the fly (hopefully getting permissions for the device back), with a further fix in that area made approx. last week. So it may well be that a custom build of the current master would help.

As for debug_min - please see docs (man pages, config file settings) for the daemon in question. But that's about NUT debug setting (via config files instead of hacking init scripts), not about hardware connectivity flip-flops.

recklessnl commented 1 year ago

@jimklimov thanks for the reply.

> Recent fixes included usbhid-ups ability to reconnect on the fly (hopefully getting permissions for the device back), with further fix in that area made approx. last week. So it may quite be that a custom build of current master would help.

Will this be upstreamed to the Debian unstable repo soon? Would be easier than maintaining a custom build. Still having issues with this.

jimklimov commented 1 year ago

AFAIK recipes were proposed, search NUT issues from this summer for "debian" or "ubuntu". What happens next is up to distros...

Notably -- not sure which service definitions they would use eventually (NUT's or their old ones)...

jimklimov commented 1 year ago

Note: recently had to dive into the code to see what code writes PID files into which locations; this is analyzed in https://github.com/networkupstools/nut/issues/1712#issuecomment-1327627850

UPDATE: ...and summarized the area in https://github.com/networkupstools/nut/wiki/Technicalities:-Work-with-PID-and-state-file-paths