jordens / pyflycapture2

python bindings for the flycapture v2 api (libflycapture-2c)

Multithreading USB3 creates SIGSEGV or Thread Lockup #20

Closed DPastl closed 7 years ago

DPastl commented 7 years ago

Hello,

I've been experimenting with the flycapture library lately, trying to get a multithreaded application to work. So far the GigE cameras work okay, but I can't get USB3 cameras to work using exactly the same code:

from multiprocessing import Process
from time import sleep
import flycapture2 as fc2

class SimpleCam(Process):
    camdex = None

    def __init__(self, camdex):
        Process.__init__(self)
        self.camdex = camdex
        print str(self.camdex) + " Getting context"
        self.camera = fc2.Context()
        self.camera.connect(*self.camera.get_camera_from_index(self.camdex))
        print self.camera.get_camera_info()

    def run(self):
        print str(self.camdex) + " starting capture"
        self.camera.start_capture()
        image = fc2.Image()
        print str(self.camdex) + " retreiving image"
        self.camera.retrieve_buffer(image)
        print str(self.camdex) + " stopping capture"
        self.camera.stop_capture()

if __name__ == '__main__':
    sc0 = SimpleCam(1)
    sc0.start()
    sleep(10)
    sc0.join()

Here I'm using the camera index to select between the GigE and USB3 cameras connected to my system, and confirming the selection from the camera info printout.

If I select a USB3 camera, I will get the following output:

1 Getting context
{'sensor_resolution': '2448x2048', 'firmware_build_time': 'Fri Nov 20 00:50:04 2015', 'sensor_info': 'Sony IMX264 (2/3" Color CMOS)', 'serial_number': 16401081, 'firmware_version': '1.10.3.0', 'vendor_name': 'Point Grey Research', 'model_name': 'Chameleon3 CM3-U3-50S5C'}
1 starting capture

Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

What's interesting to me is that moving the context creation into run() allows the USB3 camera to work, but only with a single camera. As soon as you add a second thread running at the same time, one or both threads lock up.

I'm looking into the cause of the problem, but I thought I'd post this to see if anyone else has any ideas or suggestions. Flycapture2 is supposedly capable of multithreading so this should be possible.

jordens commented 7 years ago

This is multiprocessing, not multithreading. The child Process that executes run() never gets its own Context, so this can't work: __init__() is still executed in the parent, and run() is then executed in the child after start() forks.

I am pretty sure that most of the flycapture2 API is not thread-safe. One context per thread/process and no data sharing should be fine though.
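A minimal sketch of the point above, written for Python 3 and assuming a Unix-like system where the "fork" start method is available. PidReporter is a hypothetical stand-in for SimpleCam that touches no camera; it only records which process runs __init__() and which one runs run().

```python
import multiprocessing as mp
import os

# Assumption: a Unix-like OS, where the "fork" start method is available.
ctx = mp.get_context("fork")


class PidReporter(ctx.Process):
    """Hypothetical stand-in for SimpleCam; it never opens a camera."""

    def __init__(self, results):
        super().__init__()
        self.results = results
        # Runs in the parent: anything created here exists before the fork.
        self.init_pid = os.getpid()

    def run(self):
        # Runs in the child, after start() has forked.
        self.results.put((self.init_pid, os.getpid()))


def pids_seen():
    """Return (pid that ran __init__, pid that ran run)."""
    results = ctx.Queue()
    worker = PidReporter(results)
    worker.start()
    pids = results.get(timeout=10)
    worker.join()
    return pids


if __name__ == "__main__":
    init_pid, run_pid = pids_seen()
    print("__init__ ran in pid", init_pid, "and run() ran in pid", run_pid)
```

The two pids differ, which is why a Context created in __init__() belongs to the parent rather than to the process that later calls start_capture().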

DPastl commented 7 years ago

That's very interesting and not at all the behavior I expected, although it does explain some issues I've been having. Following that line of thought then:

If I change the code to be as follows:

from multiprocessing import Process
from time import sleep
import flycapture2 as fc2

class SimpleCam(Process):
    camdex = None

    def __init__(self, camdex):
        Process.__init__(self)
        self.camdex = camdex

    def run(self):
        print str(self.camdex) + " Getting context"
        self.camera = fc2.Context()
        self.camera.connect(*self.camera.get_camera_from_index(self.camdex))
        print self.camera.get_camera_info()
        print str(self.camdex) + " starting capture"
        self.camera.start_capture()
        image = fc2.Image()
        print str(self.camdex) + " retreiving image"
        self.camera.retrieve_buffer(image)
        print str(self.camdex) + " stopping capture"
        self.camera.stop_capture()

if __name__ == '__main__':
    sc0 = SimpleCam(0)
    sc1 = SimpleCam(1)
    sc0.start()
    sc1.start()
    sleep(5)
    sc0.join()
    sc1.join()

I now get one process or the other getting stuck:

0 Getting context
1 Getting context
{'sensor_resolution': '2448x2048', 'firmware_build_time': 'Fri Nov 20 00:50:04 2015', 'sensor_info': 'Sony IMX264 (2/3" Color CMOS)', 'serial_number': 16401257, 'firmware_version': '1.10.3.0', 'vendor_name': 'Point Grey Research', 'model_name': 'Chameleon3 CM3-U3-50S5C'}
0 starting capture
0 retreiving image
0 stopping capture

But 1 never continues.

However, if I create a script that accesses only a single camera and run it from two different terminals (with a different camera for each, of course), it runs just fine. It's almost as though the problem is that I'm not creating two instances of pyflycapture?
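The "two terminals" experiment can be approximated from a single launcher by starting one fresh interpreter per camera. This is only a sketch in Python 3: CHILD_CODE is a hypothetical placeholder for the single-camera script and does not call flycapture2 at all.

```python
import subprocess
import sys

# Hypothetical stand-in for a single-camera script. A real version would
# import flycapture2 and capture from the camera index given in sys.argv[1].
CHILD_CODE = "import sys; print('camera', sys.argv[1], 'done')"


def run_in_fresh_interpreters(indices):
    """Start one new Python interpreter per camera index and collect stdout."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(index)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for index in indices
    ]
    # All children are already running; now wait for each one in turn.
    return [proc.communicate()[0].strip() for proc in procs]


if __name__ == "__main__":
    print(run_in_fresh_interpreters([0, 1]))
```

Each child here starts from a clean interpreter, so nothing is inherited from the launcher, which matches the two-terminal setup.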

jordens commented 7 years ago

Does it hang in Context() or in connect()? Is this Windows or Linux?

But your code should work. I don't see anything in pyflycapture2 that would prevent it. My best guess is that flycapture2 is broken (or needs some trickery) and that equivalent code using the C API directly would hang in exactly the same way. Does the flycapture2 documentation say anything about multiple processes or threads?

jordens commented 7 years ago

Or does it hang in get_camera_info() or in get_camera_from_index()?

DPastl commented 7 years ago

It appears to be hanging at fc2.Context(); it never gets to "get_camera_info()".

From a couple of articles I've found on Point Grey's website, it sounds as though the library is thread-safe and that they actually recommend multithreading.

I've looked through the flycapture2 documentation and couldn't find any references to multithreading/multiprocessing.

DPastl commented 7 years ago

I'll create a support request with PG to see if they have any ideas.

DPastl commented 7 years ago

I just wrote a bit of code to take the Flycapture2_test.cpp example and convert it to run two cameras concurrently in a threaded manner. It seems to work just fine.

flycapture_threadtest.zip

DPastl commented 7 years ago

So another interesting tidbit in my findings (Sorry @jordens for all the emails I'm likely generating) is that if I import dynamically, there's no problem at all:

from multiprocessing import Process
from time import sleep

class SimpleCam(Process):
    camdex = None

    def __init__(self, camdex):
        Process.__init__(self)
        self.camdex = camdex

    def run(self):
        import flycapture2 as fc2
        print str(self.camdex) + " Getting context"
        self.camera = fc2.Context()
        print str(self.camdex) + " Getting Camera"
        self.camera.connect(*self.camera.get_camera_from_index(self.camdex))
        print self.camera.get_camera_info()
        print str(self.camdex) + " starting capture"
        self.camera.start_capture()
        while(1):
            image = fc2.Image()
            print str(self.camdex) + " retreiving image"
            self.camera.retrieve_buffer(image)
            sleep(1)
        print str(self.camdex) + " stopping capture"
        self.camera.stop_capture()

if __name__ == '__main__':
    sc0 = SimpleCam(0)
    sleep(1)
    sc1 = SimpleCam(1)
    sc0.start()
    sc1.start()
    sleep(5)
    sc0.join()
    sc1.join()

I think I know what the problem is (one instance of pyflycapture being shared by two processes) but I don't have the slightest idea what the fix would be, outside of what I just did.
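The mechanism behind that guess can be observed without a camera. The sketch below (Python 3, assuming a Unix-like system with the "fork" start method) checks whether a forked child already sees a module as loaded when it starts. colorsys is an arbitrary stdlib module standing in for flycapture2, and the helper names are hypothetical. It shows only that a module imported in the parent is inherited by the child; it does not establish that this is what makes Context() hang.

```python
import colorsys  # stand-in for a module-level "import flycapture2"
import multiprocessing as mp
import sys

# Assumption: a Unix-like OS, where the "fork" start method is available.
ctx = mp.get_context("fork")


def report(module_name, results):
    # Runs in the child: was the module already loaded when the child began?
    results.put(module_name in sys.modules)


def child_sees_module(module_name):
    """Fork a child and ask whether module_name is already in sys.modules."""
    results = ctx.Queue()
    child = ctx.Process(target=report, args=(module_name, results))
    child.start()
    seen = results.get(timeout=10)
    child.join()
    return seen


if __name__ == "__main__":
    # Imported at module level in the parent, so the child inherits it.
    print("colorsys inherited:", child_sees_module("colorsys"))
    # Never imported in the parent (hypothetical name), so nothing to inherit.
    print("other inherited:", child_sees_module("module_not_imported_here"))
```

With the import moved into run(), the child would be the first process to load and initialize the module, which is the situation the dynamic-import version above creates.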

jordens commented 7 years ago
DPastl commented 7 years ago

Yes, multiprocessing and multithreading are different, but only in the Python context. In C++, multithreading is multiprocessing, provided there are multiple processors/cores to run on (from my understanding). This is one of the most annoying things I've learned about Python: that there is a distinction between the two. So with my C++ example, it "should" be running both threads concurrently and not via time-slicing on a single thread/core. Just to be thorough, I ran a test using Python's threading library. It has similar problems, but they aren't resolved by the import change. I'm not terribly interested in Python's threading library since it can only exist within a single GIL and thus a single CPU thread (and is very slow because of it).

I think perhaps my original code was sharing the parent's import of pyflycapture with each child rather than creating new imports in the GIL that each child was spawned into. That's my best guess, at least. I'm still not familiar with what's really going on under the hood with Python; it's something I really need to learn.

jordens commented 7 years ago

Multiprocessing and multithreading map just fine to C++. You create child processes with fork().

While threads share (most) resources and can in principle directly access (and corrupt) resources created in sibling threads, fork()ed sub-processes inherit copies of the parent's resources and (generally) do not share them.
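The difference can be sketched in a few lines of Python 3, assuming a Unix-like system where the "fork" start method is available. The names below are illustrative only: a thread appends to a module-level list and the parent sees the change, while a forked child appends to its own copy and the parent does not.

```python
import multiprocessing as mp
import threading

# Assumption: a Unix-like OS, where the "fork" start method is available.
ctx = mp.get_context("fork")

shared = []  # module-level state created in the parent


def append_marker(marker):
    shared.append(marker)


def demo():
    """Append from a thread and from a forked child; return the parent's view."""
    shared.clear()

    # A thread works on the very same list object as the parent.
    thread = threading.Thread(target=append_marker, args=("thread",))
    thread.start()
    thread.join()

    # A forked child works on its own copy; the parent's list is untouched.
    child = ctx.Process(target=append_marker, args=("process",))
    child.start()
    child.join()

    return list(shared)


if __name__ == "__main__":
    print(demo())  # prints ['thread']
```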

jordens commented 7 years ago

The GIL and multithreading are not a problem for pyflycapture2. The methods all release the GIL before calling into flycapture2. That means you can concurrently call e.g. retrieve_buffer() in multiple threads without seeing any problems due to the GIL.

(Admittedly, I never rigorously verified that generously releasing the GIL in pyflycapture2 is fine. But I am reasonably convinced that it is.)

The GIL is only a problem if you intend to manipulate any Python data structures from multiple threads simultaneously. You are not doing that if you are inside retrieve_buffer() or similar.

And with multiprocessing the GIL is generally irrelevant.
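The effect of releasing the GIL can be illustrated without a camera. In this Python 3 sketch, time.sleep(), which releases the GIL while it waits, stands in for a GIL-releasing call such as retrieve_buffer(); the function names are hypothetical. Two threads that each block for the same duration finish in roughly that duration, not twice it.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep() releases the GIL while waiting, so other threads can run.
    # It stands in here for a GIL-releasing call such as retrieve_buffer().
    time.sleep(seconds)


def elapsed_with_threads(n_threads=2, seconds=0.5):
    """Run n_threads blocking calls concurrently; return wall-clock seconds."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = elapsed_with_threads(2, 0.5)
    print("two 0.5 s blocking calls took %.2f s in total" % elapsed)
```

Pure-Python computation in the threads would not overlap this way, since it holds the GIL; only calls that release it do.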

DPastl commented 7 years ago

Interesting, I was unaware of that. My education/experience is mostly in low-level stuff like C or assembly for embedded systems so I'm a bit rusty on the finer points of higher level programming languages and multithreading/multiprocessing. Might have to pick up a book on the topic!

Well, I admit that I'm stumped as to the actual cause of the problem then, but since there's a solution I'll close this issue. Thanks again for all the help and feedback!