luxonis / XLink

A cross-platform library for communicating with devices over various physical links.
Apache License 2.0
12 stars 18 forks source link

libusb logs `libusb: warning [libusb_exit] device 2.0 still referenced` on app exit #62

Open diablodale opened 1 year ago

diablodale commented 1 year ago

usbLinkOpen() in the develop branch calls libusb_ref_device() to increment the ref, but I can not find any code that will decrement it using libusb_unref_device().

This is suspicious and may have unwanted sideaffects. libusb doc writes that ref/unref need to be matched and when ref=0 a device is finally destroyed.

libusb_ref_device() only exists one place...it is within refLibusbDeviceByName() https://github.com/luxonis/XLink/blob/457b23fb33e1146610e1a4e52defa7565046977a/src/pc/protocols/usb_host.cpp#L244

That function above refLibusbDeviceByName() is called by code in three locations...

usb_boot() location

Looks correct to me. It has a matching unref. https://github.com/luxonis/XLink/blob/457b23fb33e1146610e1a4e52defa7565046977a/src/pc/protocols/usb_host.cpp#L632-L665

usbLinkBootBootloader() location

Looks correct to me. It has a matching unref. https://github.com/luxonis/XLink/blob/457b23fb33e1146610e1a4e52defa7565046977a/src/pc/protocols/usb_host.cpp#L714

usbLinkOpen location

Looks suspicious. It calls refLibusbDeviceByName which increments the ref. But nothing calls unref. If I guess, I would think there should be an unref in usbLinkClose but there is not. https://github.com/luxonis/XLink/blob/457b23fb33e1146610e1a4e52defa7565046977a/src/pc/protocols/usb_host.cpp#L687

diablodale commented 1 year ago

Its very consistent. Now with more debug code and knowledge,, I can crash an app by using unref in usbLinkClose. Using the debuggers on the libusb code itself, I can see

`usbLinkClose()` function is entered with refcount=1
libusb_close()         ref = ref - 1     ==>   0   this causes libusb to destroy and deallocate the device
libusb_unref_device()  ref = ref - 1     ==>  -1   💥

If I leave the code as it is (having no unref), then on app exit with libusb logging, I am warned...

libusb: warning [libusb_exit] device 2.0 still referenced
libusb: warning [libusb_exit] device 3.0 still referenced

These are unknown devices to me. Does anyone know what a .0 device is? If I were to guess, this is related to the host controllers...probably called "hubs" by libusb. I have 3 host controllers. So hub "1" was properly deallocated. But 2 and 3 not??

diablodale commented 1 year ago

Actually...the ref counts are correct. The wrappers are working correctly. It was me and my paper tracking that was wrong. I still get the warnings on exit for those two x.0 devices. And if I enable debug level logging for libusb itself, I see some thread/mutex contention with log entries like...

[ 1.414049] [00004abc] libusb: debug [usbi_wait_for_events] WaitForMultipleObjects() for 2 HANDLEs with timeout in 60000ms   
[ 1.414215] [00004c0c] libusb: debug [libusb_release_interface] interface 0
[ 1.414427] [00004c0c] libusb: debug [libusb_close]
[ 1.414561] [00004abc] libusb: debug [usbi_wait_for_events] WaitForMultipleObjects() returned 0
[ 1.414684] [00004abc] libusb: debug [handle_event_trigger] event triggered
[ 1.414812] [00004abc] libusb: debug [handle_event_trigger] someone is closing a device
[ 1.414939] [00004abc] libusb: debug [libusb_try_lock_events] someone else is closing a device
[ 1.415065] [00004abc] libusb: debug [libusb_handle_events_timeout_completed] event handler was active but went away, retrying
[ 1.415076] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.4
[ 1.415369] [00004abc] libusb: debug [libusb_handle_events_timeout_completed] doing our own event handling
[ 1.415533] [00004c0c] libusb: debug [libusb_unref_device] destroy device 1.0
[ 1.415678] [00004abc] libusb: debug [usbi_wait_for_events] WaitForMultipleObjects() for 2 HANDLEs with timeout in 60000ms   
[ 1.415837] [00004c0c] libusb: debug [libusb_unref_device] destroy device 2.0
[ 1.416109] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.20
[ 1.416248] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.1
[ 1.416388] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.3
[ 1.416502] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.2
[ 1.416619] [00004c0c] libusb: debug [libusb_unref_device] destroy device 3.30
[ 1.422763] [000024f0] libusb: debug [winusb_get_device_list] allocating new device for session [5]
[ 1.423238] [000024f0] libusb: debug [winusb_get_device_list] found existing device for session [6]
themarpe commented 1 year ago

AFAIK some ops of libusb do deref implicitly - was this the case here at the end?

EDIT: I see you've modified the title - so some devices are still left referenced at the end, but for some the refcount goes up and down correctly?

diablodale commented 1 year ago

I'm 90% confident this is a libusb bug. I see others reporting similar warnings on exit. https://github.com/libusb/libusb/issues/1165 https://github.com/libusb/libusb/issues/988

To get the warnings, 2 or more usb OAK devices have to be attached. 1 device does not repro the issue. Set LIBUSB_DEBUG=3 and watch the log in debugger.

My XLink fork calls libusb_exit() on app exit to cleanup. The Luxonis fork does not call libusb_exit() and relies on the OS.

The app in isolation, this probably doesn't cause issues as the app exit ?should? free resources. But if two apps are running and using libusb or probing the usb hubs, it is unknown if the first app holds open the hub resources and would prevent another app using them. 🤷‍♀️