sofa-framework / sofa

Real-time multi-physics simulation with an emphasis on medical simulation.
https://www.sofa-framework.org
GNU Lesser General Public License v2.1
871 stars 297 forks source link

[SofaPython] Add GIL management #326

Closed maxime-tournier closed 6 years ago

maxime-tournier commented 6 years ago

Hello all,

This PR adds (hopefully) proper GIL handling to SofaPython, so that it is possible to run python threads concurrently with the main Sofa program. It enables the following scene to run concurrently with the main program:

def createScene(node):

    from threading import Thread
    import time

    def target():
        while True:
            time.sleep(1)
            print("it's aliiive!")

    thread = Thread(target = target)
    thread.daemon = True
    thread.start()

    time.sleep(2)

This is achieved through the RAII class PythonEnvironment::gil, which is meant to protect the boundaries between c++ and python code as follows:

// some function called from the c++ side
{
   PythonEnvironment::gil lock;
   // safely call python functions here: the GIL 
   // is acquired and released automatically
}

The trick is that the GIL should also be released prior to the first lock so that the last object to unlock the GIL gives an opportunity to python threads to run, if any. Otherwise, the main thread still holds the GIL after the last lock is destroyed, and python threads never get to run.

This is a work in progress, but should be pretty harmless and easy to disable if something goes wrong.

EDIT: it does not work on this branch :-/


This PR:

Reviewers will merge only if all these checks are true.

ocarre commented 6 years ago

That's great !

An improvement would be to release the GIL also in the sofa binded functions : when a binding is called, it should release the GIL, do its potentially costly C++ stuff then acquire the GIL back to safely return in the python script. This way, we release even more the pressure on the GIL and we can do other things in another thread ... even loading another sofa python scene from another C++ thread.

maxime-tournier commented 6 years ago

Glad you like it :-)

But first I need to figure out why it works on our fork but not on this one though :-/

After that, it would be super sweet to add finer grained GIL release as you suggest, but since we also have the monstro-SofaPython-PR pending I anticipate many merge conflicts, so the plan was to keep this one as small as possible.

BTW do we have actual costly or I/O-bound c++ operations exposed in python? And if so, do these happen during simulation or only during scene initialization?

maxime-tournier commented 6 years ago

Ok now it works, but I also need to gil-protect any module initialization that does python stuff, otherwise we get the (righteous) following error:

Fatal Python error: PyThreadState_Get: no current thread

and a segfault ensues.

On a related note, there really should be no naked (as in "not wrapped in a proper PythonEnvironment method") python call apart from bindings methods.

maxime-tournier commented 6 years ago

As an added bonus, the following now works as expected:

import Sofa

class Script(Sofa.PythonScriptController):

    def __init__(self, node):
        self.foo = 0

    def onBeginAnimationStep(self, dt):
        self.foo += 1

def createScene(node):

    script = Script(node)

    from threading import Thread

    def target(arg):
        # you can now inspect script.foo in the console during runtime        
        import code
        code.interact(local = arg)

    thread = Thread(target = target, args = (locals(), ))
    thread.start()
ocarre commented 6 years ago

The only big GIL locking part i can see is the SML scene loading which mix file loading with costly parsing in python and sofa component creation and initialization in C++. In a future PR it will be very interesting to improve this, using python only as a glue and totally avoiding heavy processes.

maxime-tournier commented 6 years ago

I see. Maybe in that case we need another primitive to locally unlock the GIL instead of acquiring the GIL.

I created an example in which a TCP server listens in the background and broadcasts the position of a falling particle as the simulation progresses to multiple TCP clients, JSON-encoded on both sides, all in python. The total is below 200 LOC.

This is awesome.

ocarre commented 6 years ago

Yes, an unlock primitive would be great in that case.

matthieu-nesme commented 6 years ago

do not forgot PythonMainScriptController

maxime-tournier commented 6 years ago

@matthieu-nesme done, what does this class even do?

maxime-tournier commented 6 years ago

gentle bump.

damienmarchal commented 6 years ago

This PR seems to have problem on the tests.

maxime-tournier commented 6 years ago

@damienmarchal you're right, my bad, it does not fail on my build :-/

Looking into it, thanks for the feedback.

maxime-tournier commented 6 years ago

[ci build]

hugtalbot commented 6 years ago

do not forget to add proper labels to help others reviewing your work quickly

damienmarchal commented 6 years ago

[ci-builld]

bcarrez commented 6 years ago

[ci-build]

hugtalbot commented 6 years ago

flag ready, green builds, I merge it

damienmarchal commented 6 years ago

Who is next ? #329 (:))

hugtalbot commented 6 years ago

@damienmarchal : none is next since, at my best knowledge, none has a "ready" flag. Right ?

damienmarchal commented 6 years ago

329 was waiting for change from Guillaume which he did the same day. No one else apart him says a no-go so...it should be ready :)

But I just noticed that the GIL merged today introduce several regression...so I suggest to revert it first then...wait maxime investigates the problem...

maxime-tournier commented 6 years ago

Hi all, just got back, will look into it asap :-)