espressomd / espresso

The ESPResSo package
https://espressomd.org
GNU General Public License v3.0

On releasing the GIL in ESPResSo #4712

Closed jngrad closed 1 year ago

jngrad commented 1 year ago

Here is the definition of the global interpreter lock (GIL) in the Cython documentation:

A lock inside the Python interpreter to ensure that only one Python thread is run at once. This lock is purely to ensure that race conditions do not corrupt internal Python state. Python objects cannot be manipulated unless the GIL is held. It is most relevant to Cython when writing code that should be run in parallel. If you are not aiming to write parallel code then there is usually no benefit to releasing the GIL in Cython.

There is one place in the ESPResSo code where the GIL is released: the integrator.run() method, if the integrator is not steepest descent. In principle we could release the GIL everywhere by making the script interface call_method() run within a nogil context, but this doesn't lead to performance improvements (tested with the testsuite on a 16-core AMD Ryzen). It is however useful for integrators, since we often run them concurrently with a Python thread that updates the OpenGL visualizer scene in real time; in that case, not releasing the GIL leads to a serious drop in frames per second.
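To illustrate why this matters for the visualizer, here is a minimal sketch in plain Python. It does not use ESPResSo: `fake_integrator_run()` and `run_with_live_updates()` are hypothetical names, and `time.sleep()` stands in for a long C++ call made with the GIL released. While the worker thread waits, the main thread can keep drawing "frames".

```python
import threading
import time


def fake_integrator_run(duration):
    """Hypothetical stand-in for a long C++ call made inside a nogil block.

    time.sleep() releases the GIL while it waits, so other Python
    threads can run in the meantime.
    """
    time.sleep(duration)


def run_with_live_updates(duration=0.2):
    """Run the stand-in integrator in a worker thread and count how many
    'frames' the main thread manages to draw while the worker is busy."""
    worker = threading.Thread(target=fake_integrator_run, args=(duration,))
    frames = 0
    worker.start()
    while worker.is_alive():
        frames += 1       # a real visualizer would redraw the scene here
        time.sleep(0.01)  # crude frame limiter
    worker.join()
    return frames


if __name__ == "__main__":
    print("frames drawn while integrating:", run_with_live_updates())
```

If the worker held the GIL for the whole call instead, the loop in the main thread could not make progress until the call returned, which is the frame rate drop described above.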

For historical reasons, the integrator script interface classes had to rely on custom and poorly-documented Cython code to call the C++ method with nogil while preserving both C++ exceptions and interrupt signals. Now that the development branch of ESPResSo requires Python 3.9, we can replace the Cython signal handling code with regular Python code. In addition, we now require a Cython version that no longer requires nogil functions to be noexcept, which means we can use call_method() instead of re-implementing it in integrate.pyx. We just need to keep in mind that except * is not recommended due to its prohibitive cost (the GIL is re-acquired for each statement in the nogil context), while all other exception specifiers come without overhead. The body of a nogil context cannot contain Python objects, because implicit coercion of Python types to C++ types requires calls to Cython functions that can throw Python exceptions, which is unsafe without the GIL (potential for race conditions).
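As an illustration of what signal handling in regular Python code can look like, here is a minimal sketch using only the standard `signal` module. It is not the code used in ESPResSo: `temporary_sigint_handler()` and `remember_interrupt()` are hypothetical names. The context manager installs a SIGINT handler for the duration of a block and restores the previous handler afterwards.

```python
import contextlib
import signal


@contextlib.contextmanager
def temporary_sigint_handler(handler):
    """Install `handler` for SIGINT and restore the previous handler on exit.

    Hypothetical helper; signal.signal() may only be called from the
    main thread.
    """
    previous = signal.signal(signal.SIGINT, handler)
    if previous is None:
        # The previous handler was not installed from Python.
        previous = signal.SIG_DFL
    try:
        yield
    finally:
        signal.signal(signal.SIGINT, previous)


interrupted = []


def remember_interrupt(signum, frame):
    # Record the request instead of raising KeyboardInterrupt right away.
    interrupted.append(signum)


with temporary_sigint_handler(remember_interrupt):
    signal.raise_signal(signal.SIGINT)  # simulate Ctrl+C (Python >= 3.8)

print("interrupt recorded:", interrupted == [signal.SIGINT])
```

Keep in mind that Python-level handlers only run once the main thread is back in the interpreter, so a handler like this one is called after a long C++ call returns, not in the middle of it.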

jngrad commented 1 year ago

For the curious, here is a quick dive into Cython code generation. All macros were expanded to help with readability. Most functions are documented in Python C API: Initialization, Finalization, and Threads.

Here is the code to call a cdef Cython function:

/* "espressomd/script_interface.pyx":192
 *         value = ptr.call_method(name, params)
 *         res = variant_to_python_object(value)             # <<<<<<<<<<<<<<
 *         if handle_errors_message is None:
 */
  PyObject *__pyx_v_res = NULL;
  try {
    __pyx_v_res = __pyx_f_10espressomd_16script_interface_variant_to_python_object(__pyx_v_value);
    if (unlikely(!__pyx_v_res)) { /* check for nullptr */
      __PYX_MARK_ERR_POS(0, 192) /* generate traceback with filename and line information */
      goto __pyx_L1_error; /* jump to error exit routine */
    }
  } catch(...) {
    __Pyx_CppExn2PyErr(); /* convert C++ exception to Python exception */
    __PYX_MARK_ERR_POS(0, 192) /* generate traceback with filename and line information */
    goto __pyx_L1_error; /* jump to error exit routine */
  }
  __Pyx_GOTREF(__pyx_v_res);

Here is the code to call a C++ function:

/* "espressomd/script_interface.pyx":191
 *         value = ptr.call_method(name, params)             # <<<<<<<<<<<<<<
 *         res = variant_to_python_object(value)
 *         if handle_errors_message is None:
 */
  ScriptInterface::Variant __pyx_v_value;
  try {
    __pyx_v_value = __pyx_v_ptr->call_method(__pyx_v_name, __pyx_v_params);
  } catch(...) {
    __Pyx_CppExn2PyErr(); /* convert C++ exception to Python exception */
    __PYX_MARK_ERR_POS(0, 191) /* generate traceback with filename and line information */
    goto __pyx_L1_error; /* jump to error exit routine */
  }

Inside a nogil context, a call to a C++ function becomes more sophisticated. First the GIL is released, then the C++ function is called, and finally the GIL is re-acquired. If the C++ function throws, the GIL is temporarily re-acquired to safely convert the C++ exception to a Python exception.

/* "espressomd/script_interface.pyx":
 *         with nogil:                                       # <<<<<<<<<<<<<<
 *             value = ptr.call_method_nogil(name, params)
 *         res = variant_to_python_object(value)
 *         if handle_errors_message is None:
 */
  #ifdef WITH_THREAD
  PyThreadState *_save = PyEval_SaveThread(); /* release the GIL */
  #endif

/* "espressomd/script_interface.pyx":191
 *         with nogil:
 *             value = ptr.call_method_nogil(name, params)   # <<<<<<<<<<<<<<
 *         res = variant_to_python_object(value)
 *         if handle_errors_message is None:
 */
  ScriptInterface::Variant __pyx_v_value;
  try {
    __pyx_v_value = __pyx_v_handle->call_method(__pyx_v_method_name_char, __pyx_v_parameters);
  } catch(...) {
    /* gracefully handle C++ exceptions */
    #ifdef WITH_THREAD
    PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure(); /* re-acquire the GIL */
    #endif
    __Pyx_CppExn2PyErr(); /* convert C++ exception to Python exception */
    #ifdef WITH_THREAD
    PyGILState_Release(__pyx_gilstate_save); /* release the GIL */
    #endif
    __PYX_MARK_ERR_POS(0, 191) /* generate traceback with filename and line information */
    goto __pyx_L6_error; /* jump to context manager error exit */
  }

/* "espressomd/script_interface.pyx":
 *         with nogil:                                       # <<<<<<<<<<<<<<
 *             value = ptr.call_method_nogil(name, params)
 *         res = variant_to_python_object(value)
 *         if handle_errors_message is None:
 */
  /* normal exit */{
    #ifdef WITH_THREAD
    PyEval_RestoreThread(_save); /* re-acquire the GIL */
    #endif
    goto __pyx_L7; /* jump to context manager normal exit */
  }
  __pyx_L6_error: { /* context manager error exit */
    #ifdef WITH_THREAD
    PyEval_RestoreThread(_save); /* re-acquire the GIL */
    #endif
    goto __pyx_L1_error; /* jump to error exit routine */
  }
  __pyx_L7:; /* context manager normal exit */
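
The control flow of the generated code above can be modelled in plain Python. This is only a toy: `toy_gil` is an ordinary `threading.Lock` standing in for the GIL, and `call_released()` is a hypothetical name. The comments point to the C API call that plays the same role in the generated code.

```python
import threading

# Stand-in for the GIL: an ordinary lock, NOT the real interpreter lock.
toy_gil = threading.Lock()
events = []  # records the order of operations for inspection


def call_released(func, *args):
    """Call func(*args) with the toy lock released, mirroring a nogil block.

    The caller must hold toy_gil on entry. It is held again on exit,
    whether func returns normally or raises.
    """
    toy_gil.release()  # like PyEval_SaveThread()
    events.append("released")
    try:
        try:
            result = func(*args)  # the C++ call, made without the lock
        except Exception as exc:
            # Error path: hold the lock only while the error is converted,
            # like PyGILState_Ensure() / PyGILState_Release().
            with toy_gil:
                events.append("error recorded")
                error = RuntimeError(f"converted: {exc}")
            raise error from exc
    finally:
        toy_gil.acquire()  # like PyEval_RestoreThread()
        events.append("re-acquired")
    return result


toy_gil.acquire()
print(call_released(pow, 2, 10))  # prints 1024
toy_gil.release()
```

Both exits re-acquire the lock exactly once, which matches the two PyEval_RestoreThread() calls in the normal and error exits of the context manager.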