Closed ryantm closed 7 years ago
Hy Ryan, thanks for the bug report!
First, I was a bit surprised that, if I start the program with the USB device selected, it keeps printing out the device. I expected it to only print the device when the device was plugged in.
I believe this is because you've specified the enumerate
flag which fires events for all matching currently attached devices. If you specify mempty
it will only fire when you attach the device.
When I disconnect the device it stops printing. When I reconnect the device, it does not start printing the device again. Given that it prints out the device connected continuously, I expected it to start again when the device was plugged back in.
That's strange. I haven't been able to reproduce that but I will try a bit harder this evening.
If I have two devices with the same Vendor ID connected when I start the program, the program outputs "Segmentation fault (core dumped)". I expected it to be able to handle multiple similar devices.
That shouldn't happen. It could also be a bug in libusb of course. It might be interesting if you can write the same program in C and see if it has the same behaviour as the Haskell program.
Hi Ryan, Hi Bas, I'd like to echo Ryan's remarks - Thank you for this usb library, especially the quality documentation.
To the point though, I'm also experiencing a segfault when using the library's asynchronous event notification features. Here are two programs, one in C and the other in Haskell, trying to keep the semantics as close as possible between the two.
This works.
Compiled with gcc 6.3.1: $cc this-works.c -Wall -ggdb -I/usr/include/libusb-1.0 -L/usr/lib -lusb-1.0
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libusb.h>
int hotplug_callback(struct libusb_context *ctx, struct libusb_device *dev,
libusb_hotplug_event event, void *user_data) {
struct libusb_device_descriptor desc;
switch (event) {
case LIBUSB_HOTPLUG_EVENT_DEVICE_ARRIVED :
(void)libusb_get_device_descriptor(dev, &desc);
printf ("Device attached: %04x:%04x\n", desc.idVendor, desc.idProduct);
break;
case LIBUSB_HOTPLUG_EVENT_DEVICE_LEFT :
printf("Device detached\n");
break;
default :
printf("Unhandled event %d\n", event);
break;
}
return 0;
}
int main (void) {
libusb_hotplug_callback_handle handle;
int rc;
libusb_init(NULL);
rc = libusb_hotplug_register_callback(NULL,
LIBUSB_HOTPLUG_EVENT_DEVICE_ARRIVED | LIBUSB_HOTPLUG_EVENT_DEVICE_LEFT,
0, 0x15a2, 0x0061, LIBUSB_HOTPLUG_MATCH_ANY, hotplug_callback, NULL,&handle);
if (LIBUSB_SUCCESS != rc) {
printf("Error creating a hotplug callback\n");
libusb_exit(NULL);
return EXIT_FAILURE;
}
while (1) {
libusb_handle_events_completed(NULL, NULL);
usleep(10000);
}
libusb_hotplug_deregister_callback(NULL, handle);
libusb_exit(NULL);
return 0;
}
This segfaults. Compiled with GHC 8.0.2. and Stack resolver lts-8.20. I'll attach the stack.yml and cabal file.
module Main where
import Control.Exception
import Control.Concurrent (threadDelay)
import Control.Monad (when, forever)
import Data.Functor (($>))
import System.Exit
import System.IO
import System.USB
import Text.Printf
defaultVid = 0x15a2
defaultPid = 0x0061
main :: IO ()
main = reportDevices
reportDevices :: IO ()
reportDevices = do
--USB context
ctx <- newCtx
when (not $ ctx `hasCapability` HasHotplug)
(die ("platform does not support libusb-1.0 hotplugging"))
_ <- mask_ $ do
-- callback for both events, report function examines event to
-- handle both cases
ah <- registerHotplugCallback ctx
(deviceArrived `mappend` deviceLeft)
mempty
(Just defaultVid)
(Just defaultPid)
Nothing
(\dev event -> report dev event $> KeepCallbackRegistered)
return ah
forever $ threadDelay 100
report :: System.USB.Device -> System.USB.HotplugEvent -> IO ()
report d e
| (matchDeviceArrived e) = printf "%s arrived\n" (show d)
| (matchDeviceLeft e) = printf "%s departed\n" (show d)
test-usb-hotplug.cabal
name: test-usb-hotplug
version: 0.1.0.0
-- synopsis:
-- description:
homepage: https://github.com/githubuser/test-usb-hotplug#readme
license: BSD3
license-file: LICENSE
author: Author name here
maintainer: example@example.com
copyright: 2017 Author name here
category: Web
build-type: Simple
extra-source-files: README.md
cabal-version: >=1.10
executable test-usb-hotplug-exe
hs-source-dirs: .
main-is: Main.hs
ghc-options: -threaded -rtsopts -with-rtsopts=-N
build-depends: base >= 4.7 && < 5
, stm
, usb >= 1.3 && < 1.4
, vector
default-language: Haskell2010
stack.yaml
flags: {}
extra-package-dbs: []
packages:
- '.'
extra-deps:
- bindings-libusb-1.4.5.0
- usb-1.3.0.4
resolver: lts-8.20
Thanks, Ian.
Ps. Hope this was the correct place to leave this information.
@Tingles thanks for the detailed report including the C and Haskell programs! Unfortunately I don't have much time to look at this now (I'm on a holiday).
A few things that come to mind:
Could you test if the usb-example exhibits the same behaviour. I know it used to work two years ago.
~See the documentation of HotplugCallback. It mentions to use as little as IO as possible in the callback. Could you rewrite your Haskell program such that instead of invoking ~report
it will put (dev, event)
in an empty MVar. Then in the main thread you should take out the (dev, event)
from the MVar and report
it.
EDIT: Please ignore the latter. I noticed that @ryantm already used the MVar approach and hit the segfault.
@basvandijk @ryantm Looking back at @ryantm's original demo code I believe I was too quick to post and I missed the brunt of the point in with my C implementation. I'll have have another go at implementing the original Haskell code in C code
@ryantm I don't have VirtualBox on my system but I spun up a Ubuntu-16.04.2 LTS guest on a KVM to test this out.
First, I was a bit surprised that, if I start the program with the USB device selected, it keeps printing out the device. I expected it to only print the device when the device was plugged in.
I was not able to recreate this with the Ubuntu KVM guest. So because of that I can't address the first of your two points.
However I was able to duplicate your second point.
If I have two devices with the same Vendor ID connected when I start the program, the program outputs "Segmentation fault (core dumped)". I expected it to be able to handle multiple similar devices.
Here's what I got on a Ubuntu-16.04.2 LTS virtual KVM guest with ghc 7.10.3
and usb 1.3.0.4
.
I observed,
*** Error in
`./.stack-work/install/x86_64-linux/lts-6.25/7.10.3/bin/test-usb-hotplug-exe':
double free or corruption (out): 0x00007f5874001880 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f58969007e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f589690937a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f589690d53c]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(+0xb365)[0x7f5897a26365]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_unref_device+0x77)[0x7f5897a1f967]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_free_device_list+0x29)[0x7f5897a1f9c9]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_hotplug_register_callback+0x1e9)[0x7f5897a25f49]
./.stack-work/install/x86_64-linux/lts-6.25/7.10.3/bin/test-usb-hotplug-exe[0x46403a]
======= Memory map: ========
00400000-008bc000 r-xp 00000000 fc:01 13564
/home/ian/tmp/test-usb-hotplug/.stack-work/install/x86_64-linux/lts-6.25/7
.10.3/bin/test-usb-hotplug-exe
00abb000-00abc000 r--p 004bb000 fc:01 13564
/home/ian/tmp/test-usb-hotplug/.stack-work/install/x86_64-linux/lts-6.25/7
.10.3/bin/test-usb-hotplug-exe
....(removed for space)
[vsyscall]
Aborted (core dumped)
This last scenario... a "double free or corruption" error I've seen before when trying to do, what I suspsect, is more than I should have within the registered callback function - like trying to add/remove devices toa shared TMVar Set USB.Devices
One thing I notice in the original demo code you provided is that we've forgone making a Weak
pointer to the mvar
created in waitForMyDevice
which evalutes a finalizer (destructor like thing I think..) when the mvar
is garbage collected. (reference https://github.com/basvandijk/usb-example/blob/master/example.hs#L56 )
Another thing I was unsure of when reading the demo program was the exact semantics of the forever $ do ... delay
block. Still learning Haskell myself, I wasn't sure if we are actually looping in the forever $ do
block... e.g.
wait for device,
print it,
delay
re-register callback,
wait for device,
print it
delay ... etc`
Adding a putStrLn "register callback"
statement at the top of waitForMyDevice
and going through the last of the three scenarios I describe above, no devices plugged in when starting the program, shows the semantic of looping the forever $ do
block:
$./.stack-work/install/x86_64-linux-tinfo6/lts-6.25/7.10.3/bin/hs-test-usb-hotplug-exe
registering callback.
Bus 001 Device 068
registering callback.
*** Error in
`./.stack-work/install/x86_64-linux-tinfo6/lts-6.25/7.10.3/bin/hs-test-usb-hotplug-exe':
double free or corruption (out): 0x00007f4698
001890 ***
(removed for space)
Not sure if this was your intended semantics when you wrote the demo program, as I say it took me a double take to understand what was going on here.
Because of the looping semantics and the lack of a finalizer, I suspect we're unknowingly creating more mVar
resources than intended and running into some contention between the Haskell RTS/GC and Libusb-1.0's resource management.
enumerate
HotplugFlagHope you're not sick of my text walls yet :)...
I noticed in the first backtrace & memory map dump that the double free error happens in /libusb-1.0.so.0(libusb_free_device_list+0x29)
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f58969007e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f589690937a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f589690d53c]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(+0xb365)[0x7f5897a26365]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_unref_device+0x77)[0x7f5897a1f967]
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_free_device_list+0x29)[0x7f5897a1f9c9]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_hotplug_register_callback+0x1e9)[0x7f5897a25f49]
./.stack-work/install/x86_64-linux/lts-6.25/7.10.3/bin/test-usb-hotplug-exe[0x46403a]
So I tried was using the mempty
HotplugFlag instead of the enumerate
HotPlugFlag having noticed that the above segfault and backtrace indicate the double free error happening in /lib/x86_64-linux-gnu/libusb-1.0.so.0(libusb_free_device_list+0x29)[0x7f5897a1f9c9]
producing these results:
waitForMyDevice
until either:another identical device is attached, in this case the program prints the device, delays, registers a second callback and blocks again. To me this behaves as expected and is the non-interesting case.
the device is detached and reattached, in which case the program prints the device, delays, registers a second callback and perpetually blocks... unresponsive to again reattaching the device. To me this is unexpected and an interesting case. Conducting this procedure with libusb-1.0 debug output suggests deadlock within libusb-1.0 because debug output ceases, even while dmesg/dbus correctly continues.
When starting the program with 2 identical devices already plugged in, the program does not segfault immediately, but blocks until one of the connected devices is detached and reattached in which case it prints the device, delays and re-registers the callback. Detach/reattach cycling after this initial success causes libusb-1.0 debug output to cease and an apparent deadlock identical to that described above.
When starting the program with 2 identical devices already plugged in, the program does not segfault immediately, but blocks in a manner consistent with what you might expect. However if I plug in a 3rd identical device at this point instead of a detach/reattach cycle, the program immediately segfaults.
When starting the program with no devices plugged in, the program blocks consistent as above, correctly handling the first attach hotplug event. Attaching a 2nd identical, or trying a detach/reattach cycle both result in a segfault.
To tired atm to try to and further make heads or tails of this. Most of the above hopefully makes sense... needed a brain.
This weekend maybe I'll get to writing the equivalent C program.
Thanks to your investigation I think I know what is happening and how to solve it:
Everytime libusb gives me a libusb_device
I turn it into a value of the Haskell type Device
by applying mkDev
to it:
mkDev :: Ctx -> Ptr C'libusb_device -> IO Device
mkDev ctx devPtr =
Device ctx <$> FC.newForeignPtr devPtr (c'libusb_unref_device devPtr)
As you can see I create a ForeignPtr
to the libusb_device
and register a finalizer to unref the libusb_device
once the ForeignPtr
is garbage collected.
I think this is wrong.
What I think is happening is as follows: when you attach a device with the same vendor ID and product ID libusb will give us the same pointer to the libusb_device
. However, in Haskell I create to separate Devices
from it; both with their own finalizer! Do you see the problem happening? When one Device
is garbage collected it will unref the libusb_device
. However when the second Device
is garbage collected it will unref the already unreffed libusb_device
and cause the double free.
Could you try if the fix-17 branch fixes your program?
@basvandijk - Tested fix-17 and unfortunately I can still make @ryantm's demo program segfault. I can attach my testing results if you think they'll be helpful.
You're line of thinking is on the right track though I think.
I've patched libusb1.0 like so
diff --git a/libusb/core.c b/libusb/core.c
index 2dac909..a4dcb42 100644
--- a/libusb/core.c
+++ b/libusb/core.c
@@ -1153,9 +1153,11 @@ out:
DEFAULT_VISIBILITY
libusb_device * LIBUSB_CALL libusb_ref_device(libusb_device *dev)
{
+ usbi_dbg("device %d.%d refcount: %d", dev->bus_number, dev->device_address, dev->refcnt);
usbi_mutex_lock(&dev->lock);
dev->refcnt++;
usbi_mutex_unlock(&dev->lock);
+ usbi_dbg("device %d.%d refcount: %d", dev->bus_number, dev->device_address, dev->refcnt);
return dev;
}
@@ -1171,9 +1173,11 @@ void API_EXPORTED libusb_unref_device(libusb_device *dev)
if (!dev)
return;
+ usbi_dbg("device %d.%d refcount: %d", dev->bus_number, dev->device_address, dev->refcnt);
usbi_mutex_lock(&dev->lock);
refcnt = --dev->refcnt;
usbi_mutex_unlock(&dev->lock);
+ usbi_dbg("device %d.%d refcount: %d", dev->bus_number, dev->device_address, dev->refcnt);
if (refcnt == 0) {
usbi_dbg("destroy device %d.%d", dev->bus_number, dev->device_address);
and the usb
package like so
diff --git a/System/USB/Base.hs b/System/USB/Base.hs
index e3ec242..6e92e4c 100644
--- a/System/USB/Base.hs
+++ b/System/USB/Base.hs
@@ -661,7 +661,8 @@ withDevicePtr (Device ctx devFP ) f = do
mkDev :: Ctx -> Ptr C'libusb_device -> IO Device
mkDev ctx devPtr =
- Device ctx <$> FC.newForeignPtr devPtr (c'libusb_unref_device devPtr)
+ Device ctx <$> FC.newForeignPtr devPtr (printf "dbg: [mkDev][finalizer]: Not calling unref_device on %s.\n" (show devPtr))
+ --(c'libusb_unref_device devPtr)
rebuilt libusb-bindings-1.4.5.0
, usb-1.3.0.4
, && the demo program to link with this patched libusb-1.0.
Here's the logged output of plugging in a device and subsequently removing it:
registering callback.
[timestamp] [threadID] facility level [function call] <message>
--------------------------------------------------------------------------------
[ 0.005302] [00004627] libusb: debug [libusb_handle_events_timeout_completed]
doing our own event handling
[ 0.005461] [00004627] libusb: debug [handle_events] poll fds modified,
reallocating
[ 0.005468] [00004627] libusb: debug [handle_events] poll() 2 fds with timeout
in 0ms
[ 0.005473] [00004627] libusb: debug [handle_events] poll() returned 0
[ 3.463655] [0000462b] libusb: debug [linux_get_device_address] getting address
for device: 1-4.6 detached: 0
[ 3.463671] [0000462b] libusb: debug [linux_get_device_address] scan 1-4.6
[ 3.463722] [0000462b] libusb: debug [linux_get_device_address] bus=1 dev=9
[ 3.463728] [0000462b] libusb: debug [udev_hotplug_event] udev hotplug event. action: add.
[ 3.463732] [0000462b] libusb: debug [linux_enumerate_device] busnum 1 devaddr 9 session_id 265
[ 3.463736] [0000462b] libusb: debug [linux_enumerate_device] allocating new device for 1/9 (session 265)
[ 3.463774] [0000462b] libusb: debug [libusb_ref_device] device 1.3 refcount: 3
[ 3.463777] [0000462b] libusb: debug [libusb_ref_device] device 1.3 refcount: 4
[ 3.463779] [0000462b] libusb: debug [linux_get_parent_info] Dev 0x7f20e80017d0 (1-4.6) has parent 0x140dec0 (1-4) port 6
[ 3.463902] [00004627] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 3.463910] [00004627] libusb: debug [handle_events] poll() 2 fds with timeout in 0ms
[ 3.463913] [00004627] libusb: debug [handle_events] poll() returned 1
[ 3.463916] [00004627] libusb: debug [handle_events] caught a fish on the event pipe
[ 3.463918] [00004627] libusb: debug [handle_events] hotplug message received
Bus 001 Device 009
dbg: [mkDev][finalizer]: Not calling unref_device on 0x00007f20e80017d0.
registering callback.
[ 6.025026] [0000462b] libusb: debug [linux_get_device_address] getting address for device: 1-4.6 detached: 1
[ 6.025043] [0000462b] libusb: debug [udev_hotplug_event] udev hotplug event. action: remove.
[ 6.025047] [0000462b] libusb: debug [libusb_ref_device] device 1.9 refcount: 1
[ 6.025049] [0000462b] libusb: debug [libusb_ref_device] device 1.9 refcount: 2
[ 6.025060] [0000462b] libusb: debug [libusb_unref_device] device 1.9 refcount: 2
[ 6.025063] [0000462b] libusb: debug [libusb_unref_device] device 1.9 refcount: 1
[ 6.025126] [00004627] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 6.025137] [00004627] libusb: debug [handle_events] poll() 2 fds with timeout in 0ms
[ 6.025141] [00004627] libusb: debug [handle_events] poll() returned 1
[ 6.025143] [00004627] libusb: debug [handle_events] caught a fish on the event pipe
[ 6.025145] [00004627] libusb: debug [handle_events] hotplug message received
[ 6.025150] [00004627] libusb: debug [libusb_unref_device] device 1.9 refcount: 1
[ 6.025152] [00004627] libusb: debug [libusb_unref_device] device 1.9 refcount: 0
[ 6.025154] [00004627] libusb: debug [libusb_unref_device] destroy device 1.9
[ 6.025156] [00004627] libusb: debug [libusb_unref_device] device 1.3 refcount: 4
[ 6.025158] [00004627] libusb: debug [libusb_unref_device] device 1.3 refcount: 3
^C[ 9.364079] [00004627] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 9.364092] [00004627] libusb: debug [handle_events] poll() 2 fds with timeout in 0ms
[ 9.364096] [00004627] libusb: debug [handle_events] poll() returned 1
[ 9.364098] [00004627] libusb: debug [handle_events] caught a fish on the event pipe
[ 9.364100] [00004627] libusb: debug [handle_events] hotplug message received
calling out the important lines:
[ 6.025150] [00004627] libusb: debug [libusb_unref_device] device 1.9 refcount: 1
[ 6.025152] [00004627] libusb: debug [libusb_unref_device] device 1.9 refcount: 0
[ 6.025154] [00004627] libusb: debug [libusb_unref_device] destroy device 1.9
from this I believe the issue is the registered finalized call to libusb_unref_device
when the ForeignPtr
is garbage collected. As you can see above, the references are decremented to zero even without a finalizer. This seems strange to me and un-intuitive, to not release a resource acquired across an FFI boundary.
Interesting! It's really unfortunate that this is not documented in the libusb documentation on hotplugging.
So maybe we should first increment the reference count of a hotplug allocated libusb_device
before we turn it into a Haskell Device
with the associated finalizer.
I implemented this in the really-fix-17 branch. Could you give it a spin?
@Tingles @ryantm could you test if the really-fix-17
branch solves the problem? Thanks.
@basvandijk Sorry I haven't gotten to the testing it was a Holiday weekend. I'll give it ago in now.
@basvandijk I can no longer make the demo program segfault and the reference counts look to be appropriately decremented to 0 when removing devices from my hub. This is tested on a physical host as opposed to a KVM virtual guest. I will have to test that later on as I have time.
@ryantm - More data is always better in my mind. If you happen to get a chance please let us know what you observe.
@Tingles that's good news! Thanks for testing.
I think I will go ahead and merge this to master and start preparing a release. Thanks for the report and excellent investigation.
Released as usb-1.3.0.5.
Hello.
Thanks for making this library, it has been helpful for learning how to use libusb and talk to USB devices. Here is a bug report:
I compiled this program with usb-1.3.0.4 and ghc 7.10.3, using
ghc
with no options:waitForMyDevice
is copied verbatim from: https://hackage.haskell.org/package/usb-1.3.0.4/docs/System-USB-Enumeration.html#t:HotplugCallbackI am running the program on Ubuntu inside of VirtualBox
First, I was a bit surprised that, if I start the program with the USB device selected, it keeps printing out the device. I expected it to only print the device when the device was plugged in.
I noticed two problems:
Ryan