networkupstools / nut

The Network UPS Tools repository. UPS management protocol Informational RFC 9271 published by IETF at https://www.rfc-editor.org/info/rfc9271 Please star NUT on GitHub, this helps with sponsorships!
https://networkupstools.org/
Other
2.12k stars 354 forks source link

Please port to libusb 1.0 #300

Closed bigon closed 2 years ago

bigon commented 8 years ago

Hi,

nut currently uses libusb 0.1 which is deprecated for quite some time. Nut should be ported to libusb 1.0

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810449

jblachly commented 8 years ago

Agree; I was trying to debug a nut installation on illumos and it took me quite some time to even locate the source code for libusb 0.1.

aquette commented 8 years ago

thanks for the report guys, this is on my TODO stack for too long, and time has come to move on... I'll start pushing a libusb-1.0 branch over the day, and will start to call for testing as soon as possible.

clepple commented 8 years ago

@aquette I am curious if this is the best use of our time (both development and testing). libusb-1.0 is designed to overcome API issues relating to asynchronous calls (NUT operates synchronously, in a single thread), and to add support for USB 2.0+ features (Remember that most UPSes are USB 1.x, and many are 1.5 MBit/sec). If nothing else, this sounds like an excellent use case for libusb-compat, which is not marked deprecated, just that the 0.1 API is not recommended for new code.

If we are going to require users to test their setup again, let's at least make it worth their time.

clepple commented 8 years ago

@jblachly not sure how illumos does their package mirroring, but for future reference, the upstream source is at SourceForge and Debian keeps mirrors of the tarballs that go into their packages.

aquette commented 8 years ago

@jimklimov you may be able to shed some light on the Illumos package for libusb 0.1...

jimklimov commented 8 years ago

@jblachly @aquette : I should start by reminding that "illumos" is not an OS by itself, but rather a kernel and some basic tooling. Things like libusb are part of userland which comes (or not) with specific distributions, of which there are a dozen or so.

I mostly work with OpenIndiana (and its bleeding-edge Hipster subproject), which currently has recipes mostly from back in 2012, for versions 0.1.8 of libusb-wrapper and libusb-ugen, with sources for both coming from http://libusb.sf.net/ - see https://github.com/OpenIndiana/oi-userland/tree/oi/hipster/components/libusb/ugen and https://github.com/OpenIndiana/oi-userland/tree/oi/hipster/components/libusb/wrapper .

As I see on an OmniOS Bloody box, they have driver/usb and driver/usb/ugen as well as system/header/header-usb and system/header/header-ugen packages, though I can't quickly determine the source and version for those.

I believe both of these sets of packages/recipes can plausibly date back to Sun OpenSolaris packaging (SUNWlibusbugen SUNWugen{,u} SUNWusb{,s,u,vc} etc.) including non-bumped versions.

bigon commented 8 years ago

I don't think libusb-compat is in the debian archive

jblachly commented 8 years ago

@jimklimov I am working on SmartOS, but more importantly I am going to put some effort into the illumos kernel USB drivers in the area of HID-power. As you know, illumos recognizes HID-power class USB devices as "hid", but illumos' / opensolaris usba driver hid has only drivers for kb and mouse, IIRC. So in order to use an HID-power UPS with any illumos distribution and nut, one has to load the ugen driver.

My earlier comment about "locating the source" referred not to getting nut installed (SmartOS and pkgsrc 'nut' package got everything installed no problem) , but rather when trying to track down why nut didn't work for me (the problem being of course with the hid driver versus the ugen driver), as I needed to review the libusb 0.1 source code to figure out that it was looking for /dev/usb/<vid>,<pid>/ directory tree that doesn't exist when USB device is owned by hid driver.

Anyway, please let me know if I can help in any way, and thanks everyone for working on NUT.

aquette commented 8 years ago

@jblachly having worked / assisted / contributed to ugen on Solaris with the developer (Jan Van Bruen), back in the early 2000, and also on the Linux kernel implementation, I can help. But I don't think this effort is worth. As for NUT, most USB (and HID) userland implementation just needs a generic USB access, and implement HID support on their own. Otherwise, take a look at libhid and similar userland efforts. Moreover, HID kernel support tends to be overweighted. It would be better, as I wanted for years, to develop hotplug script (iirc, we have something to complete in NUT) so that ugen is loaded when a device requires it...

aquette commented 8 years ago

worklog on the port: I've committed the configure checks: ... checking for libusb version via pkg-config... 1.0.19 found checking for libusb cflags... -I/usr/include/libusb-1.0
checking for libusb ldflags... -lusb-1.0
checking for libusb.h... yes checking for libusb_init... yes checking for libusb_detach_kernel_driver... yes ...

Configuration summary:

... build USB drivers: yes (libusb-1.0) ...

Actual implementation for the 1.0 backend underway... stay tuned.

jblachly commented 8 years ago

@aquette Thanks -- Perhaps best solution is to ensure that HID-Power class devices get the ugen driver loaded instead of hid, without having to put vendor specific usb<vid>,<pid> strings in /etc/driver_aliases?

aquette commented 8 years ago

@jblachly: indeed, that would be optimal... but not sufficient. NUT has support for USB/HID PDC (power device class) and also for XCP, Qx, and more (drivers: bcmxcp_usb, blazer_usb, nutdrv_atcl_usb, richcomm_usb, riello_usb, tripplite_usb, nutdrv_qx)

worklog on the port: usbhid-ups port complete (available on the https://github.com/networkupstools/nut/tree/libusb-1.0 branch). I've made some non-regression on an Eaton Protection Station (Vs 2.7.4), and tested settings (upsrw) too. Everything's fine so far, but tests and feedback welcome.

aquette commented 8 years ago

Worklog tracker

clepple commented 8 years ago

@aquette please remember to update Makefile.am when adding/renaming files.

http://buildbot.networkupstools.org/public/nut/waterfall?branch=libusb-1.0

aquette commented 8 years ago

indeed @clepple , thx for pointing ;) should be good for the current scope, however bear in mind that the port is not finished, so buildbot will fail to compile the tree, while it's not complete...

aquette commented 8 years ago

all, I've completed the port and further reviews/adjustments, which should cover most if not all things, hopefully. There may still be a few things left, so I'll be reviewing a last time, but you can anyhow review and test on your side, prior to considering the mainline merge. Otherwise, it compiles and distcheck-light fine on both 1.0 and 0.1. usbhid-ups is good and tested (not big tests, but good coverage still); I'll try to test bcmxcp_usb too if I can get a hand on an XCP/USB.

aquette commented 8 years ago

@cleppe: I've completely missed your comment on "the use of our time". I indeed think that switching to libusb 1.0 (not compat, which can't be marked as deprecated, since part of libusb 1.0) is a good thing. Even if we're not using isochronous endpoint, 1.0 is the way forward, and as you saw in the commit list, the port was not a huge deal. This will ease some integration, esp. on Windows and probably some BSD too (inc. Darwin). So we're now ready for the future, on the USB front...

clepple commented 8 years ago

@aquette there may not have been much to commit, but there is a lot to test, and we can't simulate all of the USB error conditions that users may encounter.

I am still not sure this is a big win for users. On OS X, we still have the same problem with USB HID devices: libusb-style access competes with the HID driver. BSD has had a problematic mix of libusb 0.1 and 1.0 APIs, and again, error conditions have been just different enough to cause problems. And I am not sure that moving to libusb 1.0 automatically solves the USB access problem on Windows (though I would like to be proven wrong on that front).

Let's not rush the 2.7.5 release. This will need a lot of testing.

clepple commented 8 years ago

@aquette can you please check on this? might only be needed for libusb-0.1.

../../../drivers/libusb1.c:70:19: warning: unused function 'typesafe_control_msg' [-Wunused-function]

clepple commented 8 years ago

@aquette I commented over on the libusb1.c patch, but it isn't showing up here. Slightly updated:

HAVE_LIBUSB_DETACH_KERNEL_DRIVER matches libusb_detach_kernel_driver() rather than libusb_set_auto_detach_kernel_driver(). FreeBSD 10.1-10.3 have the former, but not the latter. I'm guessing we use the auto variant if it exists, and fall back to the earlier one (or nothing) if not.

Reference: https://www.freebsd.org/cgi/man.cgi?format=html&query=libusb(3)

aquette commented 8 years ago

@clepple

clepple commented 8 years ago

@aquette the last commit doesn't cover richcomm_usb, which doesn't use the normal libusb code: http://buildbot.networkupstools.org/public/nut/builders/FreeBSD-x64/builds/407/steps/compile/logs/stdio

clepple commented 8 years ago

@aquette Rather than make the altsetting code only build on libusb-0.1, it should probably use this: http://libusb.sourceforge.net/api-1.0/group__dev.html#ga3047fea29830a56524388fd423068b53

clepple commented 8 years ago

Retry behavior seems different, at least on FreeBSD. I ran the usbhid-ups driver on an MGE Pulsar Evolution 500, and it worked for a few hours, then exited with the following message: No USB device found: Input/output error. I will rerun with logging to see what is going on.

clepple commented 8 years ago

@aquette the exit issue from the previous comment seems to be related to this group of statements:

https://github.com/networkupstools/nut/blob/c654bdb729a7f116e70a3272fd3c33687aaf61b2/drivers/libusb1.c#L166-L168

The routine should log and return, rather than call fatal*(), since there is a time after the device disconnects when the bus is "empty". (Not really, but the only device that NUT has permission to see is the UPS itself - not the hubs.) In the older libusb code, it would return to the main driver loop with the UPS marked as stale.

Edit: fixed URL to point to exact commit at time of comment, not just the branch.

aquette commented 8 years ago

@clepple : just a quick heads up, back from small vacations today, thx for your tests and comments/feedback. I'll look at these and will fix the code...

clepple commented 8 years ago

@aquette thanks, testing commit cb7eb673 now. Will need at least a few hours for the reconnect to happen.

aquette commented 8 years ago

@clepple beware to test the very latest commit for the reconnect issue (883a57ea5cc2efa34660ec7329bcf0c442391eee), the one you cited is the altsetting one ;)

clepple commented 8 years ago

@aquette sure enough. I am testing on FreeBSD, and when I saw the FreeBSD x64 buildbot go green, that's the one I pulled (234da3d + cb7eb67). Retesting.

clepple commented 8 years ago

The tripplite_usb driver doesn't start:

 sudo ./tripplite_usb -a tl
Network UPS Tools - Tripp Lite OMNIVS / SMARTPRO driver 0.30 (2.7.4-97-g883a57e)
Warning: This is an experimental driver.
Some features may not function correctly.

Detected a UPS: TRIPP LITE/TRIPP LITE OMNIVS1000         
nut_libusb_get_interrupt: Success
libusb_get_interrupt() returned 0 instead of 8 while sending 3a 00 ff 0d 00 00 00 00 '........'

This might be a mismatch in the return types, but like usbhid-ups, this is going to need some longer-term testing to make sure it handles transient errors properly.

The good news is that usbhid-ups seems to be reconnecting properly (as of 883a57e).

aquette commented 8 years ago

@clepple , fully right, thanks for pointing. The issue was also present for all interrupt transfers, including in usbhid-ups. The point is now fixed

clepple commented 8 years ago

@aquette thanks, testing 1041de0c9 now.

On a related note, should we mention the libusb major version in one of the driver.version.* variables? This might be useful for bug reports.

clepple commented 8 years ago

Commit db20c07 seems to be needed for multiple HID UPSes.

aquette commented 8 years ago

cool, thanks for the fix @clepple

clepple commented 8 years ago

@aquette or others: any thoughts on how to represent the libusb version in the driver variables? Something like driver.version.libusb: 1.0? Or do we want to make the name more generic to cover other HID libraries (HIDAPI, Linux hiddev, etc) like driver.version.usb: libusb-1.0?

aquette commented 8 years ago

@clepple; I'm in favor of the generic approach "driver.version.usb: libusb-1.0", just in case. May help in the future...

clepple commented 8 years ago

What about SHUT mode? Just consider it a type of USB?

aquette commented 8 years ago

@clepple: SHUT is another marshaling type (transport layer), for HID, as USB is one. We can probably leave it out, though it doesn't hurt as you've implemented it in e2939e3 Another possible approach, more generic and applicable to all drivers, would be "driver.version.communication". That would allow to provide visibility for all external communication libraries (USB, SNMP, IPMI, ...) since we already have "driver.version.internal" that gives it for internal communication layers (such as SHUT for example). Comments and thought?

clepple commented 8 years ago

@aquette it occurred to me later that I could have just left off "driver.version.usb" for SHUT, since this code would need to be implemented in the driver (otherwise, we add a dstate* dependency to the libusb/libhid layers).

I would prefer to keep the variable name specific to USB rather than generic across all interface types. My goal is to be able to search through all of the DDL or mailing list archives, for instance, and if it is "driver.version.usb", it is a simpler regex than "driver.version.communication:.*usb".

aquette commented 8 years ago

@clepple works for me, both for the variable name, and the non-implementation for SHUT. BTW, how do you feel for pushing that code part of a public (2.7.5 or separate) release? at worst, we can still promote (advertise) one of the autogenerated tarballs for those who need it / want to test... Note that I was still not able to setup some endurance testing on Eaton units for this... still lacking time

clepple commented 8 years ago

@aquette I think the rest of the code is ready for a 2.7.5 release, but the reason why I haven't merged my latest commit to the main libusb1 branch is because we would want that same variable code for all of the USB drivers (not just usbhid-ups). Also, I haven't written any documentation for the new variable name. Hopefully I can get to that in the next few days.

I was also thinking we could get some feedback on the branch via this PPA: https://launchpad.net/~clepple/+archive/ubuntu/nut

karlshea commented 8 years ago

Your PPA has at least fixed the issue I had with the version included with Ubuntu 16.04. I updated a box from 14.04 (with a clean install) and the driver kept failing somehow. I saw someone else on the users mailing list having the same issue and that led me here.

image

First graph is from last night to today, middle shows the 16.04 issues, right includes 14.04.

wpbrown commented 8 years ago

@clepple I've been having non-stop problems with Ubuntu 16.04 (SuperMicro X10SDV + CyberPower 1000PFCLCD (usb hid)) getting stale data with no log messages from the driver process. I've been trying to build the libusb1.0 branch as a deb since seeing this mailing post, but then I found this issue. Your PPA has fixed my problems. Thank you.

clepple commented 8 years ago

@wpbrown you're welcome, but @aquette did the heavy lifting for the libusb-1.0 code - I just stuffed it into the PPA (and complained when the code didn't work with some of my hardware ;-) ) so that we could work out a few bugs before it hit the other distributions.

Because there is no specific improvement in libusb-1.0 that should make this work better than libusb-0.1, keep us posted if you find you have problems down the road. Thanks for testing.

jmaggard10 commented 7 years ago

FWIW, I believe I've tracked down why some folks are having better success on Ubuntu with libusb-1.0 than with libusb-0.1. It's all due to a bad change that went in between Debian Wheezy and Jessie, and between Ubuntu 14.04 and 16.04. From the Debian/Ubuntu libusb changelog:

libusb (2:0.1.12-24) unstable; urgency=medium ...

This patch changes some error handling behavior, and on some UPS devices ends up making libusb call the blocking IOCTL_USB_REAPURB ioctl after IOCTL_USB_REAPURBNDELAY fails. That call may wait there indefinitely, until the process receives a signal. Of course, this in turn makes the the NUT driver hang there indefinitely waiting for libusb to return.

clepple commented 7 years ago

@zykh I realize there are many different models to consider, but have you had a chance to test nutdrv_qx on the libusb-1.0 branch against any hardware? (My Q* UPS is serial-only.)

zykh commented 7 years ago

Not yet, @clepple, and I can't get to the USB devices I have direct access to for at least another 4 months (unless I go and buy something new from El Cheapo...). I'll ask my contacts to test it and I'll report back in due time.

aquette commented 7 years ago

note for visibility: we (Eaton) are currently checking to provide some Qx units to @zykh Sadly, it takes more time than I expected to support the present libusb1.0 issue. Calling to anyone for feedback on this...

zykh commented 7 years ago

Just an update on what @aquette mentioned earlier: today, the 3 test units were delivered to me and tests are already undergoing on one of them (hopefully, tomorrow, I will have enough time to setup the other two). My plan is to test them for a week or so in each of the currently supported scenarios (libusb0.x, libusb-compat, libusb1.x)... and, because I feel brave, I started with the libusb1.x branch -- so far, everything looks good.

wpbrown commented 7 years ago

FYI It looks like the libusb0.1 patch @jmaggard10 found in debian that is causing problems is still in for the upcoming stretch release.

Luckily the libusb1 build is still working.