Open martin3000 opened 1 year ago
Hello, for investigation, you would need to bump debug verbosity for the driver (maybe even to level 6 if your storage permits) to see how it ticks and the last words when it does not. On older NUT releases with systemd services it is a bit more complicated than with init-scripts or with NUT 2.8.0+, see https://github.com/networkupstools/nut/wiki/Changing-NUT-daemon-debug-verbosity for details.
In my older experience there was a device that went AWOL randomly - sometimes after a week, sometimes after 20 minutes. The libusb layer just hung talking to it, I've never reliably tracked down why it fails and had to craft https://github.com/networkupstools/nut/blob/master/scripts/Solaris/reset-ups-usb-solaris.sh.sample to soft-replug the device and restart the drivers. A somewhat similarly harsh approach is documented at https://github.com/networkupstools/nut/tree/master/scripts/usb_resetter for Linux.
Drivers in NUT 2.8.0+ should be a bit more reliable at detecting their own hangs or commanding their older self to die (and so release the UPS device filesystem node) when starting a new instance.
Thank you for looking into it. From the latest ubuntu packages, I extracted usbhid-ups 2.8 and I am using it now with nut 2.7.4 But the problem remains the same.
"journalctl -u nut-driver" shows:
Jul 10 13:04:26 nuc usbhid-ups-2.8[1261]: nut_libusb_get_report: Input/Output Error
Jul 10 13:04:26 nuc usbhid-ups-2.8[1261]: nut_libusb_get_report: Input/Output Error
Jul 10 13:04:26 nuc usbhid-ups-2.8[1261]: nut_libusb_get_report: Input/Output Error
Jul 10 13:04:26 nuc usbhid-ups-2.8[1261]: nut_libusb_get_report: Input/Output Error
Jul 10 13:04:28 nuc usbhid-ups-2.8[1261]: nut_libusb_get_interrupt: Input/Output Error
Jul 10 13:04:29 nuc usbhid-ups-2.8[1261]: libusb1: Could not open any HID devices: insufficient permissions on everything
Doing "sudo /lib/nut/usbhid-ups-2.8 -a ups1 -DDDDD" I get:
0.000000 [D1] debug level is '5'
0.003820 [D5] send_to_all: SETINFO device.type "ups"
0.003842 [D2] Initializing an USB-connected UPS with library libusb-1.0.25 (API: 0x1000109) (NUT subdriver name='USB communication driver (libusb 1.0)' ver='0.43')
0.003855 [D1] upsdrv_initups (non-SHUT)...
0.016038 [D2] Checking device 1 of 11 (152D/0579)
0.016080 [D1] Failed to open device (152D/0579), skipping: Access denied (insufficient permissions)
0.016101 [D2] Checking device 2 of 11 (2109/0813)
0.016128 [D1] Failed to open device (2109/0813), skipping: Access denied (insufficient permissions)
0.016143 [D2] Checking device 3 of 11 (1D6B/0003)
0.016166 [D1] Failed to open device (1D6B/0003), skipping: Access denied (insufficient permissions)
0.016179 [D2] Checking device 4 of 11 (8087/0AAA)
0.016206 [D1] Failed to open device (8087/0AAA), skipping: Access denied (insufficient permissions)
0.016223 [D2] Checking device 5 of 11 (046D/C517)
0.016252 [D1] Failed to open device (046D/C517), skipping: Access denied (insufficient permissions)
0.016268 [D2] Checking device 6 of 11 (06DA/FFFF)
0.016961 [D1] nut_libusb_open get iManufacturer failed, retrying...
0.017525 [D1] nut_libusb_open get iManufacturer failed, retrying...
0.018028 [D1] nut_libusb_open get iManufacturer failed, retrying...
0.018524 [D1] nut_libusb_open get iProduct failed, retrying...
0.019145 [D1] nut_libusb_open get iProduct failed, retrying...
0.019707 [D1] nut_libusb_open get iProduct failed, retrying...
0.020187 [D1] nut_libusb_open get iSerialNumber failed, retrying...
0.020756 [D1] nut_libusb_open get iSerialNumber failed, retrying...
0.021228 [D1] nut_libusb_open get iSerialNumber failed, retrying...
0.021242 [D2] - VendorID: 06da
0.021252 [D2] - ProductID: ffff
0.021266 [D2] - Manufacturer: unknown
0.021277 [D2] - Product: unknown
0.021289 [D2] - Serial Number: unknown
0.021302 [D2] - Bus: 001
0.021311 [D2] - Device: unknown
0.021329 [D2] - Device release number: 0003
0.021339 [D2] Trying to match device
0.021351 [D2] match_function_subdriver (non-SHUT mode): matching a device...
0.021372 [D3] match_function_regex: matching a device...
0.021393 [D2] Device matches
0.021412 [D2] Reading first configuration descriptor
0.021443 [D3] libusb_kernel_driver_active() returned 0
0.021484 [D2] Claimed interface 0 successfully
0.021505 [D3] nut_usb_set_altinterface: skipped libusb_set_interface_alt_setting(udev, 0, 0)
0.021987 [D2] Unable to get HID descriptor (Input/Output Error)
0.022008 [D3] HID descriptor length (method 1) -1
0.022028 [D4] i=0, extra[i]=09, extra[i+1]=21
0.022044 [D3] HID descriptor, method 2: (9 bytes) => 09 21 00 01 00 01 22 ef 01
0.022075 [D3] HID descriptor length (method 2) 495
0.022093 [D2] HID descriptor length 495
0.022726 [D2] Unable to get Report descriptor: Resource temporarily unavailable
0.022774 [D2] Checking device 7 of 11 (0403/6015)
0.022811 [D1] Failed to open device (0403/6015), skipping: Access denied (insufficient permissions)
0.022823 [D2] Checking device 8 of 11 (10C4/EA60)
0.022853 [D1] Failed to open device (10C4/EA60), skipping: Access denied (insufficient permissions)
0.022874 [D2] Checking device 9 of 11 (0C45/51F4)
0.022905 [D1] Failed to open device (0C45/51F4), skipping: Access denied (insufficient permissions)
0.022922 [D2] Checking device 10 of 11 (2109/2813)
0.022947 [D1] Failed to open device (2109/2813), skipping: Access denied (insufficient permissions)
0.022964 [D2] Checking device 11 of 11 (1D6B/0002)
0.022991 [D1] Failed to open device (1D6B/0002), skipping: Access denied (insufficient permissions)
0.023005 [D2] libusb1: No appropriate HID device found
0.023023 libusb1: Could not open any HID devices: insufficient permissions on everything
0.023043 No matching HID UPS found
So the problem is "Unable to get Report descriptor: Resource temporarily unavailable"
After unplugging the UPS:
0.000000 [D1] debug level is '5'
0.003315 [D5] send_to_all: SETINFO device.type "ups"
0.003335 [D2] Initializing an USB-connected UPS with library libusb-1.0.25 (API: 0x1000109) (NUT subdriver name='USB communication driver (libusb 1.0)' ver='0.43')
0.003346 [D1] upsdrv_initups (non-SHUT)...
0.013412 [D2] Checking device 1 of 11 (152D/0579)
0.013445 [D1] Failed to open device (152D/0579), skipping: Access denied (insufficient permissions)
0.013459 [D2] Checking device 2 of 11 (2109/0813)
0.013480 [D1] Failed to open device (2109/0813), skipping: Access denied (insufficient permissions)
0.013492 [D2] Checking device 3 of 11 (1D6B/0003)
0.013511 [D1] Failed to open device (1D6B/0003), skipping: Access denied (insufficient permissions)
0.013521 [D2] Checking device 4 of 11 (8087/0AAA)
0.013544 [D1] Failed to open device (8087/0AAA), skipping: Access denied (insufficient permissions)
0.013556 [D2] Checking device 5 of 11 (046D/C517)
0.013577 [D1] Failed to open device (046D/C517), skipping: Access denied (insufficient permissions)
0.013607 [D2] Checking device 6 of 11 (06DA/FFFF)
0.018950 [D2] - VendorID: 06da
0.018987 [D2] - ProductID: ffff
0.018997 [D2] - Manufacturer: PPC
0.019006 [D2] - Product: Offline UPS
0.019018 [D2] - Serial Number: 000000000
0.019030 [D2] - Bus: 001
0.019040 [D2] - Device: unknown
0.019054 [D2] - Device release number: 0003
0.019067 [D2] Trying to match device
0.019081 [D2] match_function_subdriver (non-SHUT mode): matching a device...
0.019100 [D3] match_function_regex: matching a device...
0.019114 [D2] Device matches
0.019125 [D2] Reading first configuration descriptor
0.019153 [D3] libusb_kernel_driver_active() returned 1 (driver active)
0.019166 [D2] successfully set kernel driver auto-detach flag
0.019836 [D2] Claimed interface 0 successfully
0.019850 [D3] nut_usb_set_altinterface: skipped libusb_set_interface_alt_setting(udev, 0, 0)
0.020571 [D3] HID descriptor, method 1: (9 bytes) => 09 21 00 01 00 01 22 ef 01
0.020586 [D3] HID descriptor length (method 1) 495
0.020597 [D4] i=0, extra[i]=09, extra[i+1]=21
0.020624 [D3] HID descriptor, method 2: (9 bytes) => 09 21 00 01 00 01 22 ef 01
0.020748 [D3] HID descriptor length (method 2) 495
0.020781 [D2] HID descriptor length 495
0.036958 [D2] Report Descriptor size = 495
Just in case: while it was "Unable to get Report descriptor: Resource temporarily unavailable"
- was the other driver instance surely stopped?
Also, a bit surprised that it worked for you as /lib/nut/usbhid-ups-2.8
-- I think many drivers check the base filename to make sure the driver=...
from config matches. Wondering now if it looks at just the beginning of the string, not all of it (e.g. to ignore possible extensions), and if it should...
I renamed the 2.8 version to usbhid-ups-2.8. And I changed the .conf file. Currently it is running with debug output going to a file. It seems that the problem with the 2.8 driver is not the same as with the 2.7.4 driver.
It seems that the problem with the 2.8 driver is not the same as with the 2.7.4 driver.
Do you mean the "Resource unavailable"? Probably it is the root-cause of problem with 2.7.4 as well. In 2.8.0 if something like this happens during run-time, the driver should be able to decide that it wants to close and re-open the device (almost like restarting the driver), which may succeed or not; the older versions would have tried talking to the effectively absent device.
It is however up to the device itself (and the OS) whether it goes AWOL (e.g. some CPS-related UPS chips allegedly tend to go into power-saving in the middle of monitoring) and how it recovers (does udev
or similar system change ownership to nut:nut
so the driver may access it again)...
With 2.7.4 I could do a "sudo upsdrvctl start" and it was working again. This did not work with 2.8
Does it report any issues when "not working" this way?
By the way, note that is your NUT package is set up to use systemd for the daemons (including drivers), the manually driven upsdrvctl
may be problematic:
upsdrvctl
startsupsdrvctl
and startsAlso note that the 2.8.0+ version of the driver should support a debug_min
setting in ups.conf
(maybe the 2.7.4 upsd
would dislike it though...) If it works, setting up boosted debugging would be easier. The older upsdrvctl
would hide most of it though when forking the driver.
note that is your NUT package is set up to use systemd for the daemons (including drivers), the manually driven
upsdrvctl
may be problematic
So the right way would be a "sudo systemctl restart nut-driver.service"? I don't think that systemd tries to restart the service. Currently the service is in a failed state and I'm running the driver stand-alone for debugging purpose.
Normally yes; with NUT 2.8.0 it is even better fine-grained with a service unit instance separate for each driver like nut-driver@UPSNAME.service
(caveats apply, maybe not exactly UPSNAME
). With current master it also taps into systemd-notify mechanism (sort of a software-watchdog) so a frozen driver is detected as unhealthy and restarted.
If in your case the service is stopped, then command-line tests should be OK in terms of (non-)conflicts.
Next question: do I get it right that with your usbhid-ups-2.8
, there is some problem using upsdrvctl start
- but no problem running the driver binary directly (as in earlier screenshots)? Did you try renaming away the packaged binary and using the newer 2.8 one as usbhid-ups
verbatim - does any behavior change then?
With the 2.7.4 driver, "upsdrvctl start" always worked, with the 2.8 driver it did not work. But give me some more days to test it because it normally runs nearly a day before it fails.
For debugging purposes, I started the driver with "sudo /lib/nut/usbhid-ups-2.8 -a ups1 -DDDDD" and it is now running fine for 3 days. "nut-driver.service" ist not running, so currently I'm not using upsdrvctl. I wait 2 more days and then I'm trying it to run normally thru the nut-driver.service. Maybe there is a difference.
Thanks for the update!
So, now it failed:
275883.763669 [D4] ==================================================
275883.763683 [D4] = device has been disconnected, try to reconnect =
275883.763698 [D4] ==================================================
275883.763732 [D2] Checking device 1 of 11 (152D/0579)
275883.763777 [D1] Failed to open device (152D/0579), skipping: Access denied (insufficient permissions)
275883.763793 [D2] Checking device 2 of 11 (2109/0813)
275883.763816 [D1] Failed to open device (2109/0813), skipping: Access denied (insufficient permissions)
275883.763831 [D2] Checking device 3 of 11 (1D6B/0003)
275883.763856 [D1] Failed to open device (1D6B/0003), skipping: Access denied (insufficient permissions)
275883.763871 [D2] Checking device 4 of 11 (8087/0AAA)
275883.763897 [D1] Failed to open device (8087/0AAA), skipping: Access denied (insufficient permissions)
275883.763912 [D2] Checking device 5 of 11 (046D/C517)
275883.763935 [D1] Failed to open device (046D/C517), skipping: Access denied (insufficient permissions)
275883.763950 [D2] Checking device 6 of 11 (06DA/FFFF)
275883.764626 [D1] nut_libusb_open get iManufacturer failed, retrying...
275883.765099 [D1] nut_libusb_open get iManufacturer failed, retrying...
275883.765707 [D1] nut_libusb_open get iManufacturer failed, retrying...
275883.766173 [D1] nut_libusb_open get iProduct failed, retrying...
275883.766707 [D1] nut_libusb_open get iProduct failed, retrying...
275883.767173 [D1] nut_libusb_open get iProduct failed, retrying...
275883.767747 [D1] nut_libusb_open get iSerialNumber failed, retrying...
275883.768234 [D1] nut_libusb_open get iSerialNumber failed, retrying...
275883.768824 [D1] nut_libusb_open get iSerialNumber failed, retrying...
275883.768841 [D2] - VendorID: 06da
275883.768856 [D2] - ProductID: ffff
275883.768872 [D2] - Manufacturer: unknown
275883.768887 [D2] - Product: unknown
275883.768903 [D2] - Serial Number: unknown
275883.768918 [D2] - Bus: 001
275883.768933 [D2] - Device: unknown
275883.768949 [D2] - Device release number: 0003
275883.768964 [D2] Trying to match device
275883.768980 [D2] match_function_subdriver (non-SHUT mode): matching a device...
275883.769002 [D3] match_function_regex: matching a device...
275883.769020 [D3] match_function_exact: matching a device...
275883.769037 [D2] match_function_exact: failed match of Vendor: (null) != PPC
275883.769052 [D2] Device does not match - skipping
275883.769080 [D2] Checking device 7 of 11 (0403/6015)
275883.769111 [D1] Failed to open device (0403/6015), skipping: Access denied (insufficient permissions)
275883.769127 [D2] Checking device 8 of 11 (10C4/EA60)
275883.769152 [D1] Failed to open device (10C4/EA60), skipping: Access denied (insufficient permissions)
275883.769167 [D2] Checking device 9 of 11 (0C45/51F4)
275883.769190 [D1] Failed to open device (0C45/51F4), skipping: Access denied (insufficient permissions)
275883.769205 [D2] Checking device 10 of 11 (2109/2813)
275883.769227 [D1] Failed to open device (2109/2813), skipping: Access denied (insufficient permissions)
275883.769243 [D2] Checking device 11 of 11 (1D6B/0002)
275883.769266 [D1] Failed to open device (1D6B/0002), skipping: Access denied (insufficient permissions)
275883.769281 [D2] libusb1: No appropriate HID device found
275883.769296 libusb1: Could not open any HID devices: insufficient permissions on everything
A disconnect already happend yesterday, but a reconnect was possible. Now the reconnect does not work and the problem seems to be that nut_libusb_open cannot get the device properties. lsusb still shows the USB device.
lsusb -v for the UPS while usbhid-ups is not working:
Bus 001 Device 013: ID 06da:ffff Phoenixtec Power Co., Ltd Offline UPS
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 1.10
bDeviceClass 0
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 8
idVendor 0x06da Phoenixtec Power Co., Ltd
idProduct 0xffff
bcdDevice 0.03
iManufacturer 1 PPC
iProduct 2 Offline UPS
iSerial 4 000000000
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 0x0022
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xa0
(Bus Powered)
Remote Wakeup
MaxPower 100mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 3 Human Interface Device
bInterfaceSubClass 0
bInterfaceProtocol 0
iInterface 0
HID Device Descriptor:
bLength 9
bDescriptorType 33
bcdHID 1.00
bCountryCode 0 Not supported
bNumDescriptors 1
bDescriptorType 34 Report
wDescriptorLength 495
Report Descriptors:
** UNAVAILABLE **
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0008 1x 8 bytes
bInterval 100
It seems possible to retrieve iManufacturer and iProduct.
"sudo usbhid-dump -s 1:13" gives
001:013:000:Failed to retrieve report descriptor: Input/Output Error
After plugging the UPS USB out and in it works.
Curious... Haven't used that tool myself TBH, but it seems to imply the problem is systemic rather than NUT or its interaction with libusb. Maybe the USB chip on UPS side hangs and something like usb_resetter
mentioned above is in order.
I'm testing again with the 2.7.4 usbhid-ups driver. I have running it standalone in debug mode. When it stopped working, it was still running (but hanging). Nothing special in the logs. I just cancelled it and started it again. I did not have to unplug the usb. So that is not the same as with the 2.8 driver.
Out of curiosity: could it be that 2.8.0 and 2.7.4 somehow differently initialize or release the device?
How would it behave when 2.7.4 was running, stopped working, and then you tried starting 2.8.0? And vice versa?
The 2.7.4 driver stops working nearly every day, but this can be fixed with "sudo upsdrvctl start". I never had to unplug the usb cable (which was necessary with the 2.8 driver). So the 2.7.4 driver does not work because it hangs, and the 2.8 driver does not work because after a few days, the usb port stops working.
After usbhid-ups 2.7.4 stopped every day, I switched again to version 2.8. After it stopped, the usb port was somehow broken and a restart of the driver was not possible. But usb_resetter.py --reset-device -device 06da:ffff
did help. The usb port was repaired and I could run usbhid-ups 2.8 again.
Thanks for the confirmation, that's relatively good news :}
I have a powerwalker Vi 850 SH with the usbhid-ups driver on ubuntu, nut 2.7.4. Every day, "upsc ups1" hangs, because the ups driver is unavailable. I can fix it with "sudo upsdrvctl start", but the next day it is unavailable again. I don't see anything useful in the syslog.
It seems that the driver crashes or hangs. How can I test this?
driver.version: 2.7.4 driver.version.data: MGE HID 1.40 driver.version.internal: 0.41