networkupstools / nut

The Network UPS Tools repository. UPS management protocol Informational RFC 9271 published by IETF at https://www.rfc-editor.org/info/rfc9271 Please star NUT on GitHub, this helps with sponsorships!
https://networkupstools.org/

riello_usb NPW1000 errors #2291

Open d1nuc0m opened 5 months ago

d1nuc0m commented 5 months ago

Hello, I'm trying to use a Riello Net Power NPW1000 with NUT (it is listed as supported).

It is found by nut-scanner

$ sudo nut-scanner -U
[nutdev1]
        driver = "riello_usb"
        port = "auto"
        vendorid = "04B4"
        productid = "5500"
        product = "USB to Serial"
        vendor = "Cypress Semiconductor"
        bus = "003"

But when I edit /etc/ups/ups.conf, with this

[riellonpw1000]
  driver = riello_usb
  port = auto
  vendorid = "04B4"
  productid = "5500"
  product = "USB to Serial"
  vendor = "Cypress Semiconductor"

running sudo riello_usb -a riellonpw1000 -DD leads to "Fatal error: 'vendorid' is not a valid variable name for this driver." If I remove that line, I get the same error for productid, product and vendor.

while with this

[riellonpw1000]
  driver = riello_usb
  port = auto

it leads to "Failed to open device (...), skipping: Access denied (insufficient permissions)" for all the possible devices.

System: AlmaLinux 9.3, nut-client-2.8.0-3m, nut-2.8.0-3, Riello USB driver 0.07 (2.8.0)

jimklimov commented 5 months ago

Thanks for the report, the part about unknown keywords looks related to #1763 which should fix this issue, but was merged after NUT v2.8.0 release - should be in 2.8.1 (or custom builds of current master).

The "Access denied" part may be due to the lack of 04b4 in udev rules, upower-hid and similar mappings (whichever applies to your distro), which tell the kernel that certain vendor/product IDs should be handed off to the NUT run-time user. For some reason, this ID did not show up in such files (also on current master), so the kernel keeps owning the devfs node. As a quick fix, try adding user=root to the ups.conf section, so the driver would not drop privileges and remain able to open the device node - if that is all there is to this problem.

I'll check why it does not get listed, though.
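For anyone wanting to confirm the permissions theory before resorting to user=root, here is a minimal Python sketch that shows who owns the device node for a given vendor/product ID. It assumes the usual Linux layout (attribute files idVendor, idProduct, busnum and devnum under /sys/bus/usb/devices, and nodes under /dev/bus/usb/BBB/DDD); the function names are made up for this illustration and are not part of NUT.

```python
import stat
from pathlib import Path


def find_usb_nodes(vendor, product,
                   sysfs="/sys/bus/usb/devices", devfs="/dev/bus/usb"):
    """Return device node paths for USB devices matching vendor:product."""
    base = Path(sysfs)
    if not base.is_dir():
        return []
    nodes = []
    for dev in sorted(base.iterdir()):
        try:
            vid = (dev / "idVendor").read_text().strip().lower()
            pid = (dev / "idProduct").read_text().strip().lower()
            bus = int((dev / "busnum").read_text())
            num = int((dev / "devnum").read_text())
        except (OSError, ValueError):
            continue  # e.g. interface entries, which lack these attributes
        if (vid, pid) == (vendor.lower(), product.lower()):
            nodes.append(Path(devfs) / f"{bus:03d}" / f"{num:03d}")
    return nodes


def describe(node):
    """Summarize the mode and numeric owner/group of a device node."""
    st = Path(node).stat()
    return f"{node} mode={stat.S_IMODE(st.st_mode):o} uid={st.st_uid} gid={st.st_gid}"


if __name__ == "__main__":
    # 04b4:5500 is the ID reported by nut-scanner in this issue.
    for node in find_usb_nodes("04b4", "5500"):
        print(describe(node))
```

If the node turns out to be owned by root:root with no group access for the account the driver runs as, that would be consistent with the "Access denied" message.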

jimklimov commented 5 months ago

So... it does appear in many of the files, e.g.:

nut$ grep -ri 04b4 .
./drivers/riello_usb.c:#define RIELLO_VENDORID 0x04b4
./scripts/hotplug/libhid.usermap:libhidups      0x0003      0x04b4   0x5500    0x0000       0x0000       0x00         0x00            0x00            0x00            0x00               0x00               0x00000000
./scripts/devd/nut-usb.quirks:hw.usb.quirk.22="0x04b4 0x5500 0x0000 0xffff UQ_HID_IGNORE"
./scripts/devd/nut-usb.conf:    match "vendor"          "0x04b4";
./scripts/devd/nut-usb.conf.in: match "vendor"          "0x04b4";
./scripts/Solaris/nut-usb-driver.p5m.include.in:        alias="usb04b4,5500.*" \
./scripts/Solaris/nut-usb-driver.p5m.include:   alias="usb04b4,5500.*" \
./scripts/udev/smartnut-usbups.rules.in:ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", GROUP="@RUN_AS_GROUP@", RUN+="FIXME...nutdrvctl...FIXME"
./scripts/udev/nut-usbups.rules.in:ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", GROUP="@RUN_AS_GROUP@"
./scripts/udev/smartnut-usbups.rules:ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", GROUP="nogroup", RUN+="FIXME...nutdrvctl...FIXME"
./scripts/udev/62-nut-usbups.rules:ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", GROUP="nogroup"
./scripts/udev/nut-usbups.rules:ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", GROUP="nogroup"
./tools/nut-scanner/nutscan-usb.h:      { 0x04b4, 0x5500, "riello_usb", NULL },

(NOTE: nogroup should not appear in production builds/packages; this example is from a minimally configured dev build)

But not in ./scripts/upower/95-upower-hid.hwdb :\

In system-installed area on a Debian-derived machine I see /usr/lib/udev/rules.d/62-nut-usbups.rules and it does have 04b4.
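To repeat this check on another system without grepping by eye, a small Python sketch can extract the vendor/product pairs from a udev rules file. It only understands the ATTR{idVendor}/ATTR{idProduct} form shown in the grep output above; rules written differently would be missed, and the helper name is hypothetical.

```python
import re

# Matches the rule form seen in the grep output above, e.g.
# ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", MODE="664", ...
RULE_RE = re.compile(
    r'ATTR\{idVendor\}=="([0-9A-Fa-f]{4})",\s*'
    r'ATTR\{idProduct\}=="([0-9A-Fa-f]{4})"'
)


def ids_in_rules(text):
    """Return the set of (vendor, product) pairs found in udev rules text."""
    return {(v.lower(), p.lower()) for v, p in RULE_RE.findall(text)}


# One line copied from the grep output in this thread.
SAMPLE = (
    'ATTR{idVendor}=="04b4", ATTR{idProduct}=="5500", '
    'MODE="664", GROUP="nogroup"\n'
)

if __name__ == "__main__":
    print(("04b4", "5500") in ids_in_rules(SAMPLE))
```

On a real system, the text would come from the installed rules file instead of SAMPLE, for example /usr/lib/udev/rules.d/62-nut-usbups.rules on the Debian-derived machine mentioned above.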

jimklimov commented 5 months ago

So, this bit of mystery is solved: the script behind those files only generates entries for upower-hid database for known USB/HID device drivers, not for all USB-capable NUT drivers: https://github.com/networkupstools/nut/blame/76045f9a0217e881996a0c67281966ca4ff7fe70/tools/nut-usbinfo.pl#L235-L247

@aquette : WDYT, is this assumption still relevant? Is there some other database for "USB non-HID" devices for these platforms, or should we better make one in similar fashion?

d1nuc0m commented 5 months ago

> As a quick fix, try adding user=root to the ups.conf section, so the driver would not drop privileges and remain able to open the device node - if that is all there is to this problem.

If it can help, with user=root I get

$ sudo riello_usb -a riellonpw1000 -DD
Network UPS Tools - Riello USB driver 0.07 (2.8.0)
Warning: This is an experimental driver.
Some features may not function correctly.
(...)
   0.235611     [D2] Checking device 9 of 18 (04B4/5500)
   0.246993     [D2] - VendorID: 04b4
   0.247018     [D2] - ProductID: 5500
   0.247026     [D2] - Manufacturer: Cypress Semiconductor
   0.247033     [D2] - Product: USB to Serial
   0.247047     [D2] - Serial Number: unknown
   0.247055     [D2] - Bus: 003
   0.247062     [D2] - Device: unknown
   0.247068     [D2] - Device release number: 0000
   0.247074     [D2] Trying to match device
   0.247083     [D2] Device matches
   0.247089     [D2] Reading first configuration descriptor
   0.247105     [D2] successfully set kernel driver auto-detach flag
   0.247459     [D2] Claimed interface 0 successfully
   0.249974     [D2] HID descriptor length 37
   0.252987     [D2] Report descriptor retrieved (Reportlen = 37)
   0.253003     [D2] Found HID device
   0.253025     [D2] entering start_ups_comm()

   1.102403     [D2] Communication with UPS established
   1.662625     [D1] countlost 0
   2.334367     [D1] get_ups_status() 0
   2.798588     [D2] dstate_init: sock /run/nut/riello_usb-riellonpw1000 open on fd 9
   2.798800     [D1] Group and/or user account for this driver was customized ('root:dialout') compared to built-in defaults. Fixing socket '/run/nut/riello_usb-riellonpw1000' ownership/access.
   2.799287     [D1] Group access for this driver successfully fixed
   2.799297     [D1] countlost 0
   3.470590     [D1] get_ups_status() 0
   4.800281     [D1] countlost 0
   5.470467     [D1] get_ups_status() 0

And then it continues looping

[D1] get_ups_status() 0
[D1] countlost 0

jimklimov commented 5 months ago

To see whether it gets data, try the "dumping mode" (-d), which prints what the driver saw after a few (e.g. 1) loops in a way similar to what the upsc client reports, and then exits. It also generally helps to bump debug verbosity for an exploratory run, although looking at the code, the two repeating lines in this case mean that get_ups_status() succeeded and that overall there were zero failed attempts to get a reading.

Mixing the two suggestions:

sudo riello_usb -a riellonpw1000 -DDDDDD -d1

jimklimov commented 5 months ago

Ultimately, if it does report UPS data here, then the whole stack running as services is also expected to behave well.
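When the driver loops for a long time, the two repeating debug lines can be tallied with a short Python sketch like the one below. It relies on the reading given above (a 0 after get_ups_status() means success, and countlost is the number of failed attempts); treating any non-zero value as a problem is an assumption of this sketch, not something taken from the driver source.

```python
import re

LINE_RE = re.compile(r"\[D1\] (countlost|get_ups_status\(\)) (-?\d+)")


def summarize(log):
    """Collect the integer values logged for each of the two messages."""
    values = {"countlost": [], "get_ups_status()": []}
    for match in LINE_RE.finditer(log):
        values[match.group(1)].append(int(match.group(2)))
    return values


def looks_healthy(log):
    """True if at least one status poll was logged and every value is 0."""
    values = summarize(log)
    polls = values["get_ups_status()"]
    return bool(polls) and all(v == 0 for v in polls + values["countlost"])


# Lines copied from the -DD run earlier in this thread.
SAMPLE = """\
   1.662625     [D1] countlost 0
   2.334367     [D1] get_ups_status() 0
   2.799297     [D1] countlost 0
   3.470590     [D1] get_ups_status() 0
   4.800281     [D1] countlost 0
   5.470467     [D1] get_ups_status() 0
"""

if __name__ == "__main__":
    print(summarize(SAMPLE), looks_healthy(SAMPLE))
```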

d1nuc0m commented 5 months ago

sudo riello_usb -a riellonpw1000 -DDDDDD -d1

   0.234560     [D2] Checking device 9 of 18 (04B4/5500)
   0.245526     [D2] - VendorID: 04b4
   0.245543     [D2] - ProductID: 5500
   0.245549     [D2] - Manufacturer: Cypress Semiconductor
   0.245555     [D2] - Product: USB to Serial
   0.245563     [D2] - Serial Number: unknown
   0.245571     [D2] - Bus: 003
   0.245577     [D2] - Device: unknown
   0.245583     [D2] - Device release number: 0000
   0.245589     [D2] Trying to match device
   0.245595     [D3] match_function_regex: matching a device...
   0.245603     [D2] Device matches
   0.245610     [D2] Reading first configuration descriptor
   0.245623     [D3] libusb_kernel_driver_active() returned 0
   0.245650     [D2] Claimed interface 0 successfully
   0.245659     [D3] nut_usb_set_altinterface: skipped libusb_set_interface_alt_setting(udev, 0, 0)
   0.248524     [D3] HID descriptor, method 1: (9 bytes) => 09 21 00 01 00 01 22 25 00
   0.248531     [D3] HID descriptor length (method 1) 37
   0.248537     [D4] i=0, extra[i]=09, extra[i+1]=21
   0.248544     [D3] HID descriptor, method 2: (9 bytes) => 09 21 00 01 00 01 22 25 00
   0.248551     [D3] HID descriptor length (method 2) 37
   0.248558     [D2] HID descriptor length 37
   0.251526     [D2] Report descriptor retrieved (Reportlen = 37)
   0.251534     [D2] Found HID device
   0.251547     [D5] send_to_all: SETINFO ups.vendorid "04b4"
   0.251555     [D5] send_to_all: SETINFO ups.productid "5500"
   0.251566     [D5] send_to_all: SETINFO driver.version "2.8.0"
   0.251573     [D5] send_to_all: SETINFO driver.version.internal "0.07"
   0.251581     [D5] send_to_all: SETINFO driver.name "riello_usb"
   0.251587     [D2] entering start_ups_comm()

   0.254527     [D3] send: features report ok
   0.667141     [D3] send ok
   0.674927     [D3] read: FFFFFFF0 00 00 00 00 00 00 00
   0.690854     [D3] read: FFFFFFF7 02 22 20 15 33 30 30
   0.690869     [D5] Header detected: LAST_DATA:0,0,0,0,2,22  buf_ptr:0  

   0.706852     [D3] read: FFFFFFF5 30 30 3E 3A 03 00 00
   0.706865     [D5] 
End detected: LAST_DATA:30,30,30,3E,3A,3  buf_ptr:12  

   0.716935     [D3] in read: 12
   0.716946     [D3] riello_command ok: 12
   0.716954     [D3] Get identif Ko: command not supported
   0.716963     Bad checksum or NACK

jimklimov commented 5 months ago

CC @mzampieri70 : Cheers, would you have any ideas about this, please?

mzampieri70 commented 5 months ago

Hi, this looks like a NAK problem in the USB communication. As a first step, please try to isolate the USB cable from the mains cable. Regards.
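For readers trying to follow the "read:" lines in the log above, here is a rough Python sketch that reassembles the reply from the three 8-byte reports. It makes two assumptions inferred only from this log, not from the driver source: the FFFFFFFx values are sign-extended single bytes (F0, F7, F5), and the low nibble of the first byte is the number of payload bytes that follow (0 + 7 + 5 = 12, which matches "in read: 12").

```python
# The three report lines copied from the -DDDDDD -d1 log in this thread.
LOG = """\
   0.674927     [D3] read: FFFFFFF0 00 00 00 00 00 00 00
   0.690854     [D3] read: FFFFFFF7 02 22 20 15 33 30 30
   0.706852     [D3] read: FFFFFFF5 30 30 3E 3A 03 00 00
"""


def parse_reports(log):
    """Extract each '[Dn] read:' line as a list of byte values (0..255)."""
    reports = []
    for line in log.splitlines():
        _, sep, tail = line.partition("] read: ")
        if not sep:
            continue
        # Mask to 8 bits to undo the apparent sign extension (FFFFFFF7 -> F7).
        reports.append([int(tok, 16) & 0xFF for tok in tail.split()])
    return reports


def reassemble(reports):
    """Concatenate payloads, assuming the low nibble of byte 0 is a length."""
    frame = bytearray()
    for rep in reports:
        count = rep[0] & 0x0F  # assumption based on this log only
        frame.extend(rep[1:1 + count])
    return bytes(frame)


if __name__ == "__main__":
    frame = reassemble(parse_reports(LOG))
    print(len(frame), frame.hex(" "))
```

Under those assumptions the reply is 12 bytes long, starts with 0x02 and ends with 0x03, and its fourth byte is 0x15, which is the ASCII code for NAK. That would fit the "Bad checksum or NACK" message, although whether the driver interprets that byte this way is not confirmed here.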

d1nuc0m commented 5 months ago

> try to isolate the USB cable from the mains cable

The USB cable already follows a different path from the AC/power cables

jimklimov commented 5 months ago

@d1nuc0m : just in case, is this a shielded USB cable (foiled in the cover - so grounds of the devices match and EMI does not pass), or can you try such a cable? Maybe it can help to just unplug-replug a few times in case the (non-gilded) connectors got oxidized - this would scratch the film off and improve the signal-to-noise ratio for a while...