inspirit / PS3EYEDriver

PS3EYE Camera Driver for OSX and Windows
Other
316 stars 92 forks source link

Fixing a crash due to calling libusb_control_transfer in a different thread #29

Closed HipsterSloth closed 7 years ago

HipsterSloth commented 8 years ago

It turns out that there is a bug in libusb's windows usage of the transfer request queue that isn't multithreading safe. Specifically in windows_handle_events. It's getting a pointer to a transfer state out of the transfer request queue while the list is locked, but then reading from that pointer after the lock is release. That means if another thread comes along and writes another transfer request it could stomp the memory that that transfer occupies.

And in fact this happens when you set the exposure while the video stream is running. The video transfer thread is issuing bulk transfer requests when the main thread can come along and inject a control transfer request (i.e setting the exposure or some other camera property). We didn't notice this before in psmoveapi because we set the exposure and then start the video transfer.

Unfortunately I can't see an easy way to fix this in libusb (it passes the transfer state to a callback on completion of the transfer). Therefore the safest thing for now is to fix this is PS3EYEDriver by not submitting control transfer requests in a different thread from the video frame polling thread (where the libusb event polling happens).

This fix creates a wrapper for libusb_control_transfer that will:

The new USBMgr::usbControlTransfer() is basically identical to libusb's libusb_control_transfer() except that it submits the transfer to the video polling thread to be processed rather than doing the event polling and blocking in the main thread.

I have tested this change against both the latest version of libusb as well as the 1.0.9 release on Win10. It appears to fix the crash I was seeing when running a video feed and spamming changes to the exposure.

@rovarma I think this crash came about with the refactor of bulk transfer into it's own thread. Would you mind pulling down this change and seeing if it causes any issues for you. The fix is isolated to ps3eye.cpp and should be pretty easy to drop in.

rovarma commented 8 years ago

Thanks for the fix.

I did consider this scenario when I was making the threading changes, but came to the conclusion that it Should Just Work (tm), since everything in the libusb docs points to it being completely safe to issue requests fron multiple threads. Clearly I was wrong however; is it worth opening a bug for this on libusb's tracker?

I'm heading out of town for a few days this evening, but I'll try to have a look at this before then.

inspirit commented 8 years ago

uhm, this looks nasty :) i wonder if this should be inside the library or its better to move it out as it is only related to windows usb bug...

HipsterSloth commented 8 years ago

@rovarma Yeah the multithreading notes in libusb seem to suggest that the polling thread has to be careful about using the right api when you have synchronous calls in the main thread:

From the libusb multithreading notes

You may decide that it is OK for your multi-threaded application to ignore some of the rules and locks detailed above, because you don't think that two threads can ever be polling the descriptors at the same time. If that is the case, then that's good news for you because you don't have to worry. But be careful here; remember that the synchronous I/O functions do event handling internally. If you have one thread doing event handling in a loop (without implementing the rules and locking semantics documented above) and another trying to send a synchronous USB transfer, you will end up with two threads monitoring the same descriptors, and the above-described undesirable behaviour occuring. The solution is for your polling thread to play by the rules; the synchronous I/O functions do so, and this will result in them getting along in perfect harmony.

The video polling thread is using libusb_handle_events_timeout_completed whose implementation is suppose to follow the multithreading safe event polling method outlined in that doc. So it sure looks like we're trying to do the right thing.

@inspirit Yeah this is not ideal. This was my third attempt at fixing the issue. The previous two attempts involved trying to pause the video capture thread while the control transfer was happening, but both attempts I made resulted in dead locks. This was the safest solution I could come up with. I don't know how you could move this fix out of the ps3eyedriver short of fixing the issue in libusb. It would be pretty easy to wrap this fix in a set of win32 ifdefs though so that it only affects windows. I'll file an issue with libusb, but I don't know how long it will take for it to get fixed and this issue was blocking the PSMoveService work cboulay and I were working on now.

HipsterSloth commented 8 years ago

@inspirit @rovarma @cboulay Submitted libusb issue

rovarma commented 8 years ago

I was only able to have a quick look before flying out, but it seemed to work fine for me in regular psmoveapi usage. I didn't have time to actually test the setting of exposure during video capture.

rovarma commented 8 years ago

Ah, reading the issue you opened I see that a fix for this was already submitted in libusb a year ago. Which newer version did you test it with?

HipsterSloth commented 8 years ago

@rovarma Thanks for giving that a quick test. I tested the issue in the most recent version of master (LIBUSB_NANO 11108). The fix diabolo38 mentioned was in LIBUSB_NANO version 10982, well before the one I tested with.

HipsterSloth commented 7 years ago

I had forgotten I had opened this pull request an eternity ago. This crash hasn't come up since Bayer mode was added. Probably just best to close this PR at this point.