Closed forrestv closed 5 years ago
Thanks for this excellent report. The investigation must have been painful, I am very sorry about this.
I pushed a tentative fix to the new (and temporary) keep_callback_alive branch. Would you mind giving it a try, and reporting whether it fixes this issue ?
This patch is not as clean as I would like it to be, I may be missing something.
(moved below comment from commit afcc79f1f3
to this issue)
When I ran it with the branch at first, I got this error:
Traceback (most recent call last):
File "_ctypes/callbacks.c", line 315, in 'calling callback function'
File "/home/forrest/sonar/software/usb1/__init__.py", line 408, in __callbackWrapper
self.close()
File "/home/forrest/sonar/software/usb1/__init__.py", line 353, in close
if self.isSubmitted():
File "/home/forrest/sonar/software/usb1/__init__.py", line 799, in isSubmitted
return addressof(self.__transfer.contents) in self.__submitted_dict
AttributeError: 'NoneType' object has no attribute 'contents'
That was easily fixed with this diff:
diff --git a/usb1/__init__.py b/usb1/__init__.py
index aec6627..1133403 100644
--- a/usb1/__init__.py
+++ b/usb1/__init__.py
@@ -796,7 +796,7 @@ class USBTransfer(object):
Tells if this transfer is submitted and still pending.
"""
# Relies on None never being in __submitted_dict.
- return addressof(self.__transfer.contents) in self.__submitted_dict
+ return self.__transfer is not None and addressof(self.__transfer.contents) in self.__submitted_dict
def submit(self):
"""
With that change, it runs. The segfault previously took an average of around 12 hours to happen; I'll run it for a few days and see if it happens.
From reading the diff, I agree that it should fix the problem.
My code that previously segfaulted after about 12 hours on average has been running for a few days without problems. I think that that's good evidence that the problem is fixed.
When I ran it with the branch at first, I got this error:
Good catch, and the comment above that line is also a mistake (which shows that I initially did consider None
and then forgot about it): that line was originally self.__transfer in self.__submitted_dict
, except transfer
is not a hashable type.
My code that previously segfaulted after about 12 hours on average has been running for a few days without problems. I think that that's good evidence that the problem is fixed.
Fantastic !
I pushed it to master and released 1.7.1 with it.
Thank you for the very efficient debugging & bug reporting, and sorry for my slow responses.
During the entirety of the time that libusb is calling the callback that python-libusb1 provides, the thunk (stored in
USBTransfer.__ctypesCallbackWrapper
) needs to be kept alive. CallingUSBTransfer.close
within the callback destroys the (still-running) thunk, causing occasional segfaults in libffi when returning back to libusb.Currently, the
__submitted
flag andbefore_submit
/after_completion
callbacks are used to keep theUSBTransfer
alive, and that's almost sufficient. The__ctypesCallbackWrapper
is special, though; it needs to be guaranteed to live a little bit longer - until after__callbackWrapper
has returned.I think that the easiest fix here is to:
__callbackWrapper
a@staticmethod
__ctypesCallbackWrapper
in the class body rather than in__init__
so it's shared between allUSBTransfer
sUSBTransfer
object corresponding to the libusb transfer pointer that is provided to the callback