Open Zaitchikk opened 5 years ago
Hey anyone! Can I hope that the problem will ever be fixed?
Please contact whoever maintains NUT. They are responsible for this working or not in the first instance. If they come back and say it is a kernel bug then feel free to file a bug here, but as things stand this is not a kernel bug so this is not the correct place to report it.
but as things stand this is not a kernel
If in the earlier version of Linux for Raspberry, the program worked normally with USB ports, and after updating this operating system, the program stops working, most likely, there have been some changes in the kernel of the operating system. And if we also take into account that the program works fine on Linux for X86 / 64 platforms, we can conclude that the Linux kernel for Raspberry does not work with USB ports in a standard way.
It's difficult to debug without the hardware.
The suspicion would be that it is something that has changed between the 4.9 and 4.14 kernels. It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC which will be using a similar kernel (actually 4.18 and 5.0 respectively IIRC)
You can use rpi-update to download particular releases using hashes from https://github.com/Hexxeh/rpi-firmware/commits/master sudo rpi-update 5c80565
should get the last 4.9 release.
PLEASE DO NOT DO THIS ON A CRITICAL SYSTEM.
sudo apt install --reinstall raspberrypi-kernel raspberrypi-bootloader
should get you back to apt's version of the kernel and firmware.
It's difficult to debug without the hardware.
You can check on any other UPS that is supported by the driver "usbhid-ups".
It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC
I tested this UPS on the "Raspberry Pi Desktop". The system worked under "VMware® Workstation 15 Player 15.1.0 build-13591040" Everything works. Here is the version of that system.
$ uname -a
Linux raspberry 4.9.0-9-amd64 #1 SMP Debian 4.9.168-1+deb9u2 (2019-05-13) x86_64 GNU/Linux
It'd be interesting to know if it worked against something like Ubuntu 18.10 or 19.04 on an x86 PC
Now I tested on Debian 10 under the same Workstation 15 Player. Everything works great.
$ uname -a
Linux v 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5 (2019-06-19) x86_64 GNU/Linux
Hi, one of the NUT developers here. @Zaitchikk can confirm, but the base version of NUT in the two versions of Raspbian and Debian 10 is the same (2.7.4). As I understand it, the other differences between the working version and the non-working version of Raspbian are the kernel version, and potentially libusb-0.1, which NUT 2.7.4 uses to talk to this particular UPS. What about the versions of libusb? (dpkg -l libusb-0.1\*
) If not libusb, it is likely a kernel regression.
Hi, @clepple - good to have you here. From the usbhid-ups debug output above it seems like the tool is failing quite early. Could you (or anybody) run it under strace on working and non-working systems so we can see the system calls that lead to the error?
# On the old, working system
$ sudo strace -o strace_good.txt /lib/nut/usbhid-ups -DD -a spider
# And on the new, broken system
$ sudo strace -o strace_bad.txt /lib/nut/usbhid-ups -DD -a spider
Feel free to change any of the usbhid-ups parameters as you see fit, as long as they are the same in both cases.
What about the versions of libusb?
libusb-0.1-4:a 2:0.1.12-30 armhf userspace USB programming library
And on the new, broken system $ sudo strace -o strace_bad.txt /lib/nut/usbhid-ups -DD -a spider
I cannot quickly send debug output for the working system (Raspbian (2018-03-13)), because I will have to install it. But I will do it now, and will send a little later.
Thanks. There are a lot of EACCESS and EPERM errors in the trace - can you just confirm that you ran it with sudo?
can you just confirm that you ran it with sudo?
Yes
By the way, I can quickly send debug output to the working desktop Debian 10 (under VMware® Workstation 15 Player). I do not know if this is necessary.
Yes please - the two systems ought to look about the same at this level.
Here for the desktop Debian10 strace_good.txt
Thanks again - the output is very different, but I can see that the syscalls that return EPIPE on the broken system:
write(2, "Checking device (0D9F/0004) (001"..., 38) = 38
open("/dev/bus/usb/001/004", O_RDWR) = 4
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4) = -1 EPIPE (Broken pipe)
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4) = -1 EPIPE (Broken pipe)
ioctl(4, USBDEVFS_CONTROL, 0x7e9807c4) = -1 EPIPE (Broken pipe)
are succeeding on the Debian 10 system:
write(2, "Checking device (0D9F/0004) (002"..., 38) = 38
openat(AT_FDCWD, "/dev/bus/usb/002/004", O_RDWR) = 4
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 6
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 34
ioctl(4, USBDEVFS_CONTROL, 0x7ffc26aaaa50) = 6
Did you try downgrading to a 4.9 kernel as @6by9 suggested:
You can use rpi-update to download particular releases using hashes from https://github.com/Hexxeh/rpi-firmware/commits/master sudo rpi-update 5c80565 should get the last 4.9 release. PLEASE DO NOT DO THIS ON A CRITICAL SYSTEM. sudo apt install --reinstall raspberrypi-kernel raspberrypi-bootloader should get you back to apt's version of the kernel and firmware.
A safer thing to try would be to use the upstream dwc2 driver instead of the heavily optimised dwc_otg driver that is the default on Raspberry Pi kernels. Simply add dtoverlay=dwc2
to config.txt and reboot. USB performance will be reduced, so this isn't a long-term fix, but it might help to pinpoint the cause.
Simply add
dtoverlay=dwc2
to config.txt and reboot.
Yes it works !! I suspected there was a way to get 3B + to work as before. I think it should somehow better describe this method in the documentation. The problem could be solved a couple of months ago.
I bought Raspberry in order to get some SNMP card functionality for the UPS. USB speed is not important to me at all.
And by the way ... if it's still interesting. I promised. Here is the debug output for a working system (Raspbian (2018-03-13)). strace_good.txt
That's an interesting result, as it narrows it down to a driver regression.
If I could ask for one more thing, could you do the same strace capture using the dwc2 driver? strace_dwc2.txt would be a good name for the output file.
OK! Here it is strace_dwc2.txt
Thank you very much - that's a big help. It shows that dwc2 today behaves the same as the dwc_otg used to, which should guide our USB expert when he looks into the problem.
Can you test rpi-update firmwares in sequence to find the first failing one?
Use hashes from this repository: https://github.com/Hexxeh/rpi-firmware/commits/master
And do sudo rpi-update <hash>
to install a specific version.
Can you test rpi-update firmwares in sequence to find the first failing one?
I checked only official releases downloaded from http://downloads.raspberrypi.org/raspbian/images/ Raspbian (2018-03-13) worked. uname -a Linux raspberrypi 4.9.80-v7+ #1098 SMP Fri Mar 9 19:11:42 GMT 2018 armv7l GNU/Linux The next version - Raspbian (2018-04-18) no longer worked.
I tried your way. And this version is no longer working uname -a Linux raspberrypi 4.14.27-v7+ #1100 SMP Fri Mar 16 13:51:48 GMT 2018 armv7l GNU/Linux
It occurred to me that I jumped over the version. I set the previous one. And it doesn’t work either. uname -a Linux raspberrypi 4.14.26-v7+ #1099 SMP Wed Mar 14 14:59:28 GMT 2018 armv7l GNU/Linux
Simply add
dtoverlay=dwc2
to config.txt and reboot.
Not for long I was happy. It turned out that with this setting, the NUT driver, although it starts up, falls off periodically. After that, the debug output is as follows: strace_bad_dwc2.txt It is treated only by rebooting Raspberry.
That's irritating. The breakage corresponds to a kernel major version update and not anything to do with a dwc_otg driver change.
Using the dwc_otg driver, can you try capturing the USB "traffic" in both cases using with usbmon/wireshark, then upload the traces somewhere? https://discussions.flightaware.com/t/monitoring-usb-utilization-on-raspberry-pi/16421
Instead of filtering on address/endpoint, just dump the whole output from usbmon1 until after the transmission breaks (or starts normally on 4.9). Note: don't enter any passwords via keyboard when wireshark is capturing as these will appear in the packets. You may also want to disconnect the network cable while capture is ongoing to reduce the amount of spam created.
https://discussions.flightaware.com/t/monitoring-usb-utilization-on-raspberry-pi/16421
I cannot repeat this instruction. It seems to me this line of script is not working.
$ sudo mount -t debugfs none_debugs /sys/kernel/debug
mount: none_debugs is already mounted or /sys/kernel/debug busy
At least I don't see any of the instructions in the wireshark window
It appears that debugfs is already mounted so you can skip that step.
You may need to run wireshark as root (from a terminal on the desktop) in order to see the usbmon interfaces.
to run wireshark as root
Yes, it did.
Now explain in more detail what conditions should be. You want me to start debugging in the current configuration (Raspbian 2019-07-10 with the native dwc_otg driver), at the time the NUT driver starts. Did I understand correctly?
Yes, capture with the default driver configuration. Ideally, start the capture with the device unplugged, then plug the device in, then start NUT. You can stop the capture after the communication breaks in the newer version of kernel (4.14.26) and after communication starts successfully in the older version of kernel (4.9.80).
Here is the first attempt. I rebooted Raspberry with the UPS turned off, then did as you wrote. The file contains the result. If you got what you need, let me know. I will repeat the process on 4.14.26 4.9.80_successful_start.pcapng.zip
I tried to install an older system, designated as "firmware: Rebuild from correct tree" dated March 10, 2018. Its version:
$ uname -a
Linux raspberrypi 4.14.24-v7 + # 1097 SMP Mon Mar 5 16:42:05 GMT 2018 armv7l GNU / Linux
And the NUT driver did not start either. Could there be a problem in "rebuild ... ing"? Here is the wireshark debug output: 4.14.24_Failed_start.pcapng.zip
As I said, NUT start with the dwc2 driver does not work stably.
I was have a little dance with a tambourine around the fire, and I found one new and very strange circumstance. Perhaps it will prompt developers to solve this annoying problem.
I was able to run NUT with the default driver (dwc_otg) with my Powercom SPD-850U UPS. The sequence is as follows:
sudo upsdrvctl start
After that NUT works, but I'm not sure which is stable.The following sequence is also possible:
sudo / home / pi / hub-ctrl -h 1 -P 2 -p 0
sudo / home / pi / hub-ctrl -h 1 -P 2 -p 1
Source here https://github.com/codazoda/hub-ctrl.c https://github.com/codazoda/hub-ctrl.c/wiki/Raspberry-Pi
sudo upsdrvctl start
After that NUT works, but I'm not sure which is stable.After these procedures, the NUT can be reconnected to any port. The driver picks it up automatically ... BUT only until the next restart of Ruspberry.
Kernel version
$ uname -a
Linux raspberrypi 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux
Whoops, forgot about this one - what with Pi4 release and all.
The traces you've provided are mostly identical up until the point the host requests a string descriptor for index 0x0. In the working case, a 6-byte string descriptor is returned (two LANGIDs), in the broken case you get -EPIPE (protocol error). All subsequent transfers are broken in the same manner.
Can you try with latest rpi-update firmware? It bumps kernel to 4.19 and also has a few dwc_otg fixes for small transfers such as these.
@Zaitchikk Did you have chance to try with the latest firmware? I suspect that apt update apt full-upgrade would be sufficient, rather than using rpi-update.
I tried "apt full-upgrade" now. No effect. Only those magic dances with a tambourine, about which I wrote above, allow me to launch NUT.
Sorry the late bump, but i'm having exactly this issue and since i have a raspberry pi 4, i don't know if the kernel reversal solution could work for me.
Describe the bug I have a Powercom SPD-850U UPS. I try to connect this UPS with my Raspberry PI 3B + via NUT, I came to the conclusion - all works nice with Raspbian (2018-03-13), the first one that supports 3B +, but does not work with the latest version of Raspbian (2019-04-08). I think, the algorithm for working with the USB bus has changed in the latest versions of the kernel. And this is bad, because it breaks down the work of previously debugged programs.
To reproduce I use the following NUT driver settings. This works on both the Debian desktop and Raspbian(2018-03-13) with this UPS.But does not work on Raspbian (2019-04-08)
sudo nano /etc/nut/ups.conf [spider] driver = usbhid-ups port = auto desc = "Poaercom SPD-850U"
Expected behaviour The NUT driver start command must start the NUT driver
$ sudo upsdrvctl start Network UPS Tools - UPS driver controller 2.7.4 Network UPS Tools - Generic HID driver 0.41 (2.7.4) USB communication driver 0.33 Using subdriver: PowerCOM HID 0.5
Actual behaviour Attempt to launch NUT driver
sudo upsdrvctl start Network UPS Tools - UPS driver controller 2.7.4 Network UPS Tools - Generic HID driver 0.41 (2.7.4) USB communication driver 0.33 No matching HID UPS found Driver failed to start (exit status=1)
System
Raspberry PI3B+
$ cat /etc/rpi-issue Raspberry Pi reference 2019-04-08 Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, e910bf55ac92e27c6e77043e8995bcb5e0a2d742, stage5
$ vcgencmd version Mar 27 2019 15:45:53 Copyright (c) 2012 Broadcom version 2e98b31d18547962e564bdf88e57b3df7085c29b (clean) (release) (start)
$ uname -a Linux raspberrypi 4.14.98-v7+ #1200 SMP Tue Feb 12 20:27:48 GMT 2019 armv7l GNU/Linux
Logs
$ dmesg [ 937.439522] usb 1-1.1.2: new low-speed USB device number 6 using dwc_otg [ 937.600259] usb 1-1.1.2: New USB device found, idVendor=0d9f, idProduct=0004 [ 937.600275] usb 1-1.1.2: New USB device strings: Mfr=3, Product=1, SerialNumber=2 [ 937.600284] usb 1-1.1.2: Product: HID UPS Battery [ 937.600292] usb 1-1.1.2: Manufacturer: POWERCOM Co.,LTD [ 937.600301] usb 1-1.1.2: SerialNumber: 004-0D9F-000 [ 937.808194] hid-generic 0003:0D9F:0004.0002: hiddev96,hidraw0: USB HID v1.00 Device [POWERCOM Co.,LTD HID UPS Battery] on usb-3f980000.usb-1.1.2/input0
$ sudo /lib/nut/usbhid-ups -DD -a spider Network UPS Tools - Generic HID driver 0.41 (2.7.4) USB communication driver 0.33 0.000000 debug level is '2' 0.002877 upsdrv_initups... 0.004120 Checking device (0D9F/0004) (001/006) 0.006031 - VendorID: 0d9f 0.006103 - ProductID: 0004 0.006176 - Manufacturer: unknown 0.006273 - Product: unknown 0.006366 - Serial Number: unknown 0.006461 - Bus: 001 0.006557 - Device release number: 0002 0.006650 Trying to match device 0.006765 Device matches 0.007697 Unable to get HID descriptor (error sending control message: Broken pipe) 0.007879 HID descriptor length 996 0.008555 Unable to get Report descriptor: Broken pipe