Open ecejeff opened 7 years ago
By the way, I don't get timeout callbacks for the 2 transfers that "time out". I only get error callbacks on the 4 that "stalled".
From a cursory glance at the source code (and I am not well versed on libusb): it appears to me that the two transfers must never get removed from the flying transfers list, which means they refer to stale file descriptors. In the windows_handle_events method, all flying transfers are looped through and matched to file descriptors - this is where a failure on reconnect occurs. I wonder if after the AbortPipe operation fails, the transfers should somehow find their way to the usbi_handle_transfer_cancellation function so that they are removed from the flying transfers list. I could be completely wrong on this however.
In any case, libusb Windows does not support hotplug as of now.
Close this for now. Will reopen this if things still happen after the hotplug feature is implemented.
I have no idea if this is related to hotplug or not, but I figure I'll comment our "fix" anyways: the patch is here. Again, I have no idea if this is the proper fix or not, but to sum it up: During AbortPipe, something bad would happen to trigger a "no device" error. This would cause the usbi_handle_transfer_cancellation to never get called, so the transfer is actually left hanging in that state. Another transfer would then attempt to use the transfer fd, which was never cleaned up. So, when the "no device" error from AbortPipe happens, a list of these bad transfers is maintained and then the usbi_handle_transfer_cancellation is called on each bad transfer. This has solved most of our problems, though I am not sure if it's really the best way to fix this.
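The idea summarized above can be sketched as a small standalone model. This is not the linked patch and not libusb's internal code: every name below (model_transfer, model_abort_pipe, model_handle_cancellation, model_cancel_all) is hypothetical, and the flying list is reduced to a fixed array. It only illustrates the two-step shape described in the comment: collect the transfers whose abort failed with "no device", then run the cancellation handling on each of them so none is left holding a stale fd.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TRANSFERS 8

/* Hypothetical stand-in for one in-flight ("flying") transfer. */
struct model_transfer {
    int fd;         /* descriptor the transfer is waiting on, -1 once released */
    bool in_flight; /* true while the transfer is still on the flying list */
};

/* Hypothetical stand-in for the context that owns the flying list. */
struct model_ctx {
    struct model_transfer transfers[MODEL_MAX_TRANSFERS];
    size_t count;
};

/* Stand-in for the abort call: it fails when the device is already gone. */
static bool model_abort_pipe(bool device_present)
{
    return device_present;
}

/* Stand-in for the cancellation handler: retire the transfer and its fd. */
static void model_handle_cancellation(struct model_transfer *t)
{
    t->in_flight = false;
    t->fd = -1;
}

/* Try to abort every in-flight transfer. Transfers whose abort fails with
 * "no device" are collected first, then cleaned up one by one.
 * Returns the number of transfers cleaned up this way. */
static size_t model_cancel_all(struct model_ctx *ctx, bool device_present)
{
    struct model_transfer *bad[MODEL_MAX_TRANSFERS];
    size_t bad_count = 0;

    for (size_t i = 0; i < ctx->count && i < MODEL_MAX_TRANSFERS; i++) {
        struct model_transfer *t = &ctx->transfers[i];
        if (!t->in_flight)
            continue;
        if (!model_abort_pipe(device_present))
            bad[bad_count++] = t; /* no completion will ever arrive */
        /* on success the normal completion path reports the cancellation */
    }

    for (size_t i = 0; i < bad_count; i++)
        model_handle_cancellation(bad[i]);

    return bad_count;
}

/* Helper for inspection: how many transfers are still on the flying list. */
static size_t model_count_in_flight(const struct model_ctx *ctx)
{
    size_t n = 0;
    for (size_t i = 0; i < ctx->count && i < MODEL_MAX_TRANSFERS; i++)
        if (ctx->transfers[i].in_flight)
            n++;
    return n;
}
```

In this model, calling model_cancel_all with device_present set to false empties the flying list, so a later lookup by fd cannot hit a leftover entry. Whether the real fix should live at this point in the backend is exactly the open question raised in the comment.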
I don't see how hotplug (callbacks on new devices and devices leaving) is related to this bug in any way.
With all the recent changes to how Windows handles transfers and events, I'd like to know if this is still an issue. Can anyone on this thread confirm?
Using the latest git, I now get a different error when unplugging one of two devices:
libusb: error [windows_iocp_thread] GetQueuedCompletionStatus failed: [31] A device attached to the system is not functioning.
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
libusb: warning [handle_timeout] async cancel failed -5 errno=2
And the other device becomes hung as well.
Here's a more detailed log (not the same run, but the same effect).
Thanks for the quick feedback! I just pushed a commit (37e8b1334e59485e1d4735f5f67b31eba37ae5c9) to address an issue pointed out by your log. Can you try again?
It fixed the unplugging issue, but I've uncovered another one in the process of testing the fix. It happens during teardown.
C:\Libraries\Asphodel_Win64_b727e2c1d314c36642ae0b3571db0676e5aa66a5>asphodel_streaming.exe
Found 2 devices!
Enabling 2 streams from UVB687
Enabling 3 streams from WMRP2147
Press any key to stop data collection...
Disabling 2 streams from UVB687
Disabling 3 streams from WMRP2147
libusb: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusb: warning [do_close] A cancellation for an in-flight transfer hasn't completed but closing the device handle
libusb: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusb: warning [do_close] A cancellation for an in-flight transfer hasn't completed but closing the device handle
libusb: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusb: warning [do_close] A cancellation for an in-flight transfer hasn't completed but closing the device handle
libusb: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusb: warning [do_close] A cancellation for an in-flight transfer hasn't completed but closing the device handle
libusb: error [do_close] Device handle closed while transfer was still being processed, but the device is still connected as far as we know
libusb: warning [do_close] A cancellation for an in-flight transfer hasn't completed but closing the device handle
I then get a segfault (or rather, the Windows equivalent). Here's the exception:
Unhandled exception at 0x000007FEDD5A0988 (libusb-1.0.dll) in asphodel_streaming.exe: 0xC0000005: Access violation reading location 0x0000000000000040. occurred
Here's the backtrace:
libusb-1.0.dll!windows_handle_transfer_completion(usbi_transfer * itransfer) Line 763
at C:\Libraries\libusb\libusb\os\windows_common.c(763)
libusb-1.0.dll!handle_event_trigger(libusb_context * ctx) Line 2083
at C:\Libraries\libusb\libusb\io.c(2083)
libusb-1.0.dll!handle_events(libusb_context * ctx, timeval * tv) Line 2192
at C:\Libraries\libusb\libusb\io.c(2192)
libusb-1.0.dll!libusb_handle_events_timeout_completed(libusb_context * ctx, timeval * tv, int * completed) Line 2293
at C:\Libraries\libusb\libusb\io.c(2293)
Asphodel64.dll!usb_poll_device(AsphodelDevice_t * device, int milliseconds, int * completed) Line 2432
at c:\codebuild\tmp\output\src570898187\src\bitbucket.org\suprocktech\asphodel\src\asphodel_usb.c(2432)
I've seen similar warnings on older versions, but never a segfault. I understand this may be an artifact of my code, but is there an easy way for the library to guard against this?
Closing a device while there are outstanding transfers is 100% an application error. Handling this gracefully has been discussed in #540, #610 and #703.
I will close this and please refer to #703.
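On the application side, the usual way to avoid this is to cancel every outstanding transfer (libusb_cancel_transfer), keep handling events until each transfer's callback has run, and only then call libusb_close. A minimal sketch of that bookkeeping follows. It is deliberately written without libusb so it builds on its own: pending_tracker, pump_events_fn, drain_before_close and demo_pump are hypothetical names, and the pump callback stands in for one pass of an event-handling call such as libusb_handle_events_timeout_completed.

```c
#include <stddef.h>

/* Hypothetical per-device bookkeeping kept by the application. */
struct pending_tracker {
    int in_flight; /* transfers submitted whose callback has not run yet */
};

/* Call right after a transfer has been submitted successfully. */
static void tracker_on_submit(struct pending_tracker *t)
{
    t->in_flight++;
}

/* Call from every transfer callback, whatever the completion status
 * (completed, cancelled, timed out, no device, ...). */
static void tracker_on_callback(struct pending_tracker *t)
{
    if (t->in_flight > 0)
        t->in_flight--;
}

/* Stand-in for one pass of the event loop. In a real application this
 * would wrap the libusb event-handling call with a short timeout. */
typedef void (*pump_events_fn)(void *user);

/* After cancelling all transfers, pump events until every callback has
 * fired or max_rounds passes have elapsed.
 * Returns 0 when it is safe to close the handle, -1 otherwise. */
static int drain_before_close(struct pending_tracker *t,
                              pump_events_fn pump, void *user,
                              int max_rounds)
{
    for (int round = 0; t->in_flight > 0 && round < max_rounds; round++)
        pump(user);
    return t->in_flight == 0 ? 0 : -1;
}

/* Demo pump: pretends exactly one callback fires per event-loop pass. */
static void demo_pump(void *user)
{
    tracker_on_callback((struct pending_tracker *)user);
}
```

The bounded loop is a design choice for the sketch: if a cancellation never completes (as in the log above), the caller gets -1 and can decide what to do instead of blocking forever in teardown.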
I will still keep this open since the issue keeps popping up.
We've experienced something similar on device disconnects. In particular, we get the error [winusbx_submit_bulk_transfer] ReadPipe/WritePipe failed: [22] The device does not recognize the command message, which corresponds to Win32's ERROR_BAD_COMMAND.
We're currently experimenting with this small patch, which just translates ERROR_BAD_COMMAND (and a couple other errors, for good measure) into LIBUSB_ERROR_NO_DEVICE inside the *do_bulk_transfer functions: https://github.com/libusb/libusb/commit/9e0349e9cb1908c42d62a78358b601d656f2de56. Initial tests show that it solves the problem in our case, and there is precedent elsewhere in libusb for converting these specific Win32 errors into LIBUSB_ERROR_NO_DEVICE.
We're happy to upstream this change, if you want it. Let me know...
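The translation described above can be sketched as a small mapping function. This is not the code from the linked commit. The function name and the SK_ prefixed constants are hypothetical, redefined locally so the sketch builds on any platform. The numeric values are the standard Win32 and libusb ones. Only ERROR_BAD_COMMAND is named in the comment; treating ERROR_GEN_FAILURE (the [31] seen in an earlier log) and ERROR_DEVICE_NOT_CONNECTED the same way is an assumption about which "couple other errors" are meant.

```c
/* Win32 error codes, redefined locally so this builds without <windows.h>. */
#define SK_ERROR_BAD_COMMAND          22   /* "The device does not recognize the command" */
#define SK_ERROR_GEN_FAILURE          31   /* "A device attached to the system is not functioning" */
#define SK_ERROR_DEVICE_NOT_CONNECTED 1167 /* assumed to be one of the "other errors" */

/* libusb error codes, redefined locally so this builds without libusb.h. */
#define SK_LIBUSB_ERROR_IO        (-1)
#define SK_LIBUSB_ERROR_NO_DEVICE (-4)

/* Hypothetical helper: map the Win32 error from a failed pipe read/write
 * to a libusb error code. Errors that typically show up when the device
 * has been unplugged become NO_DEVICE; everything else stays a generic
 * I/O error. */
static int sketch_translate_win32_error(unsigned long win32_err)
{
    switch (win32_err) {
    case SK_ERROR_BAD_COMMAND:
    case SK_ERROR_GEN_FAILURE:
    case SK_ERROR_DEVICE_NOT_CONNECTED:
        return SK_LIBUSB_ERROR_NO_DEVICE;
    default:
        return SK_LIBUSB_ERROR_IO;
    }
}
```

Returning NO_DEVICE rather than a generic I/O error lets the application distinguish an unplug from a transient failure and stop resubmitting transfers.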
I've got a device with 6 concurrent bulk transfers to the same endpoint. Sometimes when unplugging the device, libusb gets into a state where it cannot recover. On a fast computer (like my development PC), this happens maybe 10% of the time. On a slow computer it's closer to 100%. Tested on various systems and it definitely follows this trend.
Tested on Windows 7 64-bit and Windows XP 32-bit.
Seems to be an issue where the file descriptors and transfers get out of sync. Seems like a race condition to me.
Here's the beginning of the relevant log portion, where we can see a couple of successful transfers full of data. Everything is going great...
At this point libusb realizes the device is gone, and calls 4 of the 6 callbacks with a stall error code. The remaining 2 transfers time out, and libusb tries to call AbortPipe, which fails (because the device is gone?).
Repetitive parts of the log have been trimmed while I plug in the new device.
The new device is connected and then libusb gets into a state where it cannot recover, complaining "could not find a matching transfer for fd 1". This pattern repeats forever and the program never collects more data.