networkupstools / nut

The Network UPS Tools repository. UPS management protocol Informational RFC 9271 published by IETF at https://www.rfc-editor.org/info/rfc9271 Please star NUT on GitHub, this helps with sponsorships!
https://networkupstools.org/
Other
1.66k stars 333 forks source link

logging "nut_libusb_get_string: Success" every 12 seconds #2399

Closed fenugrec closed 1 month ago

fenugrec commented 1 month ago

journalctl log is being filled by two log messages every 12 seconds :

Apr 14 10:30:06 machin nut-driver@belk[476]: nut_libusb_get_string: Success
Apr 14 10:30:06 machin usbhid-ups[476]: nut_libusb_get_string: Success
Apr 14 10:30:18 machin nut-driver@belk[476]: nut_libusb_get_string: Success
Apr 14 10:30:18 machin usbhid-ups[476]: nut_libusb_get_string: Success
....

Looks like maybe here https://github.com/networkupstools/nut/blob/7f36d4909d2b194b2126b7e68df08046adfffe62/drivers/libusb1.c#L1022C8-L1022C42 , libusb is probably returning 0 ? libusb_get_string_descriptor_ascii docs say it returns

number of bytes returned in data, or LIBUSB_ERROR code on failure

and LIBUSB_SUCCESS = 0, so this would explain the message.

My questions :

[EDIT] hardware info :

battery.charge: 100
battery.charge.low: 30
battery.charge.warning: 30
battery.runtime: 120
battery.type: PbAc
battery.voltage: 138.0
battery.voltage.nominal: 12
device.mfr: Belkin
device.model: UPS
device.serial: 
device.type: ups
driver.debug: 0
driver.flag.allow_killpower: 0
driver.name: usbhid-ups
driver.parameter.pollfreq: 10
driver.parameter.pollinterval: 2
driver.parameter.port: auto
driver.parameter.synchronous: auto
driver.state: quiet
driver.version: 2.8.1
driver.version.data: Belkin/Liebert HID 0.18
driver.version.internal: 0.52
driver.version.usb: libusb-1.0.27 (API: 0x100010a)
input.voltage: 123.5
input.voltage.nominal: 120
ups.beeper.status: enabled
ups.delay.shutdown: 20
ups.delay.start: 30
ups.firmware: 0
ups.load: 49
ups.mfr: Belkin
ups.model: UPS
ups.power.nominal: 550
ups.productid: 0551
ups.serial: 
ups.status: OL CHRG
ups.test.result: Done and passed
ups.timer.shutdown: 0
ups.timer.start: 0
ups.type: offline
ups.vendorid: 050d

Thanks !

jimklimov commented 1 month ago

Thanks for the report! It seems the actual report comes from libusb1.c::nut_libusb_strerror() or libusb0.c::libusb_strerror() which classifies libusb return codes vs. known error codes, and for some of these logs what it saw.

Since LIBUSB_SUCCESS was not in fact handled explicitly, it probably falls under default handling there as e.g.

upslogx(LOG_DEBUG, "%s: %s", desc, libusb_strerror((enum libusb_error)ret));

It is this way since at least commit 218d74a2eb (in 2010) which refactored many similar printouts all around libusb.c into this one method, and later cloned when libusb1 support was added by #300.

So one take-away is that your system happens to journal syslog debug priority messages (which is probably not a common case, or there would have been more complaints about this). Another is that the success case should probably be handled as a quieter upsdebugx() indeed :)

jimklimov commented 1 month ago

As for regular logging - the NUT drivers do poll for device information changes (some may also be expecting interrupts, depending on settings). A bigger question may be your second one - whether it is a device error to return an empty string for some reading, or if a particular libusb's build of libusb_get_string_descriptor_ascii returns 0 for success and not a length.

jimklimov commented 1 month ago

Can you please try the proposed PR, if a driver built from that codebase would work better?

:; git clone https://github.com/jimklimov/nut -b issue-2399
:; cd nut
...

following e.g. https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests for other build/test instructions

fenugrec commented 1 month ago

@jimklimov Wow that was fast, thanks for looking into this. Will try to find time to test your PR within a few days.

In the meantime I tweaked ups.conf and added debug_min = 4 , trying to see what descriptor it was trying to read (found another interesting thing too, will file a separate issue) :

   36.269210        [D1] Full update...                                                                                                                                                       
   36.269220        [D4] Entering libusb_get_report                                                                                                                                           
   36.271679        [D3] Report[get]: (2 bytes) => 21 64                                                                                                                                      
   36.271718        [D2] Path: UPS.BELKINBatterySystem.BELKINCharge, Type: Feature, ReportID: 0x21, Offset: 0, Size: 8, Value: 100                                                            
   36.271733        [D4] Entering libusb_get_report                                                                                                                                           
   36.274710        [D3] Report[get]: (2 bytes) => 3a 1e                                                                                                                                      
(trimmed)
   36.312765        [D4] Entering libusb_get_report
   36.315709        [D3] Report[get]: (2 bytes) => 29 1a
   36.315748        [D2] Path: UPS.PowerSummary.iSerialNumber, Type: Feature, ReportID: 0x29, Offset: 0, Size: 8, Value: 26
   36.321731        nut_libusb_get_string: Success
   ...

Note that this particular UPS has no serial number string :

$ lsusb -v
.....
  idVendor           0x050d Belkin Components
  idProduct          0x0551 F6C550-AVR UPS
  bcdDevice            0.01
  iManufacturer           4 Belkin
  iProduct               18 UPS
  iSerial                 0 
  bNumConfigurations      1

but from the USB specs (USB2.0, section 9.6.7) -

String index zero for all languages returns a string descriptor that contains an array of two-byte LANGID codes supported by the device.

So I don't understand why libusb, when querying the serialnumber at index 0 (in effect, we're trying to query that array of LANGID codes), would return no data / 0.

jimklimov commented 1 month ago

The device has no serial flashed-in? Other vendors happen to have strings of zeroes or spaces, also not particularly useful...

fenugrec commented 1 month ago

So I don't understand why libusb, when querying the serialnumber at index 0 (in effect, we're trying to query that array of LANGID codes), would return no data / 0.

I was doubly wrong, because

  1. I didn't notice nut is doing the right thing and not querying the descriptor if iSerialNumber == 0 : https://github.com/networkupstools/nut/blob/master/drivers/libusb1.c#L428
  2. libusb does the right thing and returns an error if calling libusb_get_string_descriptor_ascii with index=0 : https://github.com/libusb/libusb/blob/43107c84e4a5f6b15c296eff8cc3578100f35dce/libusb/descriptor.c#L1166

Thus I am still puzzled about the source of this "nut_libusb_get_string" log message. I think your PR makes this a non-issue, but I'm still curious where this lone message is coming from.

[EDIT] oh, I think I have an idea - HIDGetIndexString() takes an index (I don't know from where), and passes it to comm_driver->get_string() without validating index != 0 . What do you think ?

The device has no serial flashed-in?

Indeed, it would appear they didn't bother with a serial on this low-end model.

jimklimov commented 1 month ago

In "1." you linked to nut_libusb_open() which should not be looping :)

fenugrec commented 1 month ago

(still haven't recompiled nut, I promise I will)

In "1." you linked to nut_libusb_open() which should not be looping :)

oops. Looking again today, I see libhid.c does some validation on indices against USB_CTRL_STRINDEX_MIN , but this latter value should probably be 1; 0 is the special case of querying the LANGID array and I don't think it's ever something we do ?

Or, at the least, I think nut_libusb_get_string should be validating StringIdx > 0, because callers seem to be expecting a string, or nothing (again never the LANGID array ?)

[EDIT] - just recompiled with a debug line in nut_libusb_get_string , and it looks like Index 26 is causing trouble, not 0 as I suspected.

fenugrec commented 1 month ago

(And I just compiled current master @ b488693, and as expected the offending log message doesn't appear anymore.