mpokorny / vysmaw

Fast visibility stream muncher
GNU General Public License v3.0
1 stars 2 forks source link

vysmaw_message_unref requires Python GIL? #37

Closed caseyjlaw closed 6 years ago

caseyjlaw commented 6 years ago

I am trying to reduce the use of the Python GIL as much as possible to improve stability in my job schedule. My vysmaw app can wrap some chunks of code in a "nogil" context, which helps. However, if I include the vysmaw_message_unref call in the nogil, cython complains during compilation: Calling gil-requiring function not allowed without gil.

My understanding is that this is a c function and does not require the GIL. Any ideas how I could use this without the GIL?

caseyjlaw commented 6 years ago

I think this is an issue related to proper use in cython. I found an stackoverflow thread that presents a similar problem: https://stackoverflow.com/questions/33368628/which-c-functions-are-not-gil-requiring-in-cython

I'll investigate further, but I don't think any action is needed on the vysmaw side.

caseyjlaw commented 6 years ago

Ok, I'm still a bit confused by this. It seems that cython thinks the vysmaw_message type object is a python object. However, this only creates a compilation problem when running the c function vysmaw_message_unref. All the other c functions for popping the message, pulling data out, etc, can be used without the Python GIL. It is not clear to me why Cython compilation would object only when unref function is called. Any thoughts welcome.

caseyjlaw commented 6 years ago

One more update... I can get around the GIL locking by overloading the vysmaw_message_unref and adding a nogil on the new function.

cdef extern from "vysmaw.h" nogil:
    void vysmaw_message_unref(vysmaw_message *arg)

Is this asking for trouble? If not, I suppose this should actually be a change pushed up to vysmaw itself.

mpokorny commented 6 years ago

Are those other C functions declared with nogil in vysmaw.pxd? See for example, https://github.com/mpokorny/vysmaw/blob/54cd25d1bd04851519cb35764618b07d33683003/py/vysmaw.pxd#L169. I don't remember why I did things that way, but I'll have a look and see whether there is a good reason for it or I just forgot to declare vysmaw_message_unref with nogil.

caseyjlaw commented 6 years ago

I suppose the message pop is the main one. I am also reading properties/spectrum from msg.

mpokorny commented 6 years ago

I think that your solution is safe. I could easily add the nogil annotation to the vysmaw_message_unref declaration, which would amount to the same thing, I think. However, it's interesting that I previously added nogil to some of the declarations in vysmaw.pxd. I'm not sure why that would be necessary, so if it's OK with you, I'm going to hold off on making vysmaw source code changes just yet.

caseyjlaw commented 6 years ago

No problem. Our commissioning plan includes a few tests of valid data in commensal mode. If you can think of other tests that would reveal an issue with nogil here, let me know.

mpokorny commented 6 years ago

I went ahead and added the nogil annotation to all the functions exported from libvysmaw and libvys to cython in the vysmaw.pxd file. I believe that should be safe, since neither libvys nor libvysmaw make calls to Python. The consumer callback function is the only path through which the Python interpreter could be re-entered, but that does not happen through any of the libvys/libvysmaw functions. You'll have to hold off a bit before trying out these changes, as I've apparently broken the build...sorry.

caseyjlaw commented 6 years ago

Great to hear (about the nogil, not the broken build).

mpokorny commented 6 years ago

The build is fixed, and I've installed the latest on the CBE cluster. Try it out, without the workaround you've implemented, and please check that the GIL is no longer being locked as before. I'll close this issue after you've validated the fix.

caseyjlaw commented 6 years ago

I am trying to build the shared object library now for my cython app. It claims that the vysmaw_message_unref function still requires the GIL. I suspect I am linking against the wrong vysmaw libraries. I use an include directory of /opt/cbe-local/include and library directories of /opt/cbe-local/lib/python3.6/site-packages/vysmaw and /opt/cbe-local/lib. Does that look right?

mpokorny commented 6 years ago

Those are the correct directories. You should see the function declarations with nogil in /opt/cbe-local/lib/python3.6/site-packages/vysmaw/vysmaw.pxd; can you confirm that? How are you determining that you the call still requires the GIL? I'm looking at the generated c code from sample3, and I'm not sure what to look for.

caseyjlaw commented 6 years ago

I am inferring this from the cython build error, which I've integrated with Python setuptools build script:

(development3) cbe-node-01$ python setup.py install
Compiling filtern.pyx because it depends on /opt/cbe-local/lib/python3.6/site-packages/vysmaw/vysmaw.pxd.
Compiling timefilter.pyx because it depends on /opt/cbe-local/lib/python3.6/site-packages/vysmaw/vysmaw.pxd.
Compiling vysmaw_reader.pyx because it changed.
[1/3] Cythonizing vysmaw_reader.pyx

Error compiling Cython file:
------------------------------------------------------------
...
#                                print('ant1, ant2, antlist input: {0} {1} {2}'.format(info.stations[0], info.stations[1], self.antlist))

                    else:
                        printf('Unexpected message type: %u', msg[0].typ)

                    vysmaw_message_unref(msg)
                                       ^
------------------------------------------------------------

vysmaw_reader.pyx:308:40: Calling gil-requiring function not allowed without gil

It is clearly looking in the right location, but cython does not seem to think that vysmaw_message_unref is a nogil function. If I look at the pxd file directly, I see the nogil on that line. If you don't see anything obvious, then it may not be worth chasing down. I can just redefine the unref with nogil in my vysmaw application.

caseyjlaw commented 6 years ago

I fixed this locally in my cython by:

cdef extern from "vysmaw.h" nogil:
    void vysmaw_message_unref(vysmaw_message *arg)

But I think you also fixed, it so I'll close this issue.