pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/

Bus error when importing numpy from a separate thread. #4240

Open RealLast opened 2 years ago

RealLast commented 2 years ago

Problem description

I have a very simple program that embeds Python. I start the interpreter with py::initialize_interpreter() and then immediately import numpy. Consider the following code:

#include <iostream>
#include <thread>

#include <pybind11/embed.h>
namespace py = pybind11;

void test()
{
    py::initialize_interpreter(); // If I understand correctly, the GIL is still held at this point, so we can safely call Python.
    py::module_* mod = new py::module_(py::module_::import("numpy"));
    while(true); // block the function
}

This works when I call the function from the main thread:

int main()
{
    test();
    while(true);
}

However, if I call the same function from a different thread, it does not work. The following yields a bus error when importing numpy:

int main()
{
    std::thread testThread(&test);
    while(true);
}

I am building the application with CMake:

set (CMAKE_CXX_STANDARD 14)
add_subdirectory(pybind11)
add_executable(test main.cpp)
include_directories(${CMAKE_CURRENT_LIST_DIR}/pybind11/include)
target_link_libraries(test PRIVATE pybind11::embed)

I understand that calling Python from multiple threads is not trivial and that the GIL must be managed properly. However, this is a very simple test case: all I want to do is start and use the Python interpreter in a separate thread. No other thread uses the interpreter at the same time. The GIL should still be held after py::initialize_interpreter() (adding a py::gil_scoped_acquire after py::initialize_interpreter() does not solve the issue). The only explanation I can think of is that the Python interpreter needs to run in the main thread. If so, why is that the case? Also, this only happens when importing numpy; it does not occur when importing "sys" or "os", for example.

Thanks and best regards

Reproducible example code

// Consider the following code.
// We add a function, that initializes the interpreter and imports numpy.
// If we call this function from the original thread in main, it works.
// If we call it from a separate thread, it does not work.

#include <cstdio>   // printf
#include <iostream>
#include <thread>

#include <pybind11/embed.h>
namespace py = pybind11;

void test()
{
    py::initialize_interpreter();

    py::module_* mod = new py::module_(py::module_::import("numpy"));
    printf("done\n");
    while(true);
}

// The following works
int main()
{
    test();
    while(true);
}

// Replacing main() above with this version does not work (bus error).
int main()
{
    std::thread testThread(&test);
    while(true);
}

EthanSteinberg commented 2 years ago

I can't reproduce this bug locally using gcc 9.4 and Python 3.10.6 on Ubuntu 20.

Can you tell us more about your environment where you have the bug? What compiler, OS, and Python version are you having the bug with?

RealLast commented 1 year ago

Hello,

thank you very much for your response.

My compiler is Apple clang 13.1.6 and my Python version is 3.9.9. The OS is macOS 12.6 Monterey, running on a 2021 Apple MacBook with an M1 (ARM) chip.

Best regards
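
One possible cause worth ruling out, which nobody in this thread has confirmed: on macOS, secondary threads get a much smaller default stack (512 KB per Apple's threading documentation) than the main thread (8 MB), and std::thread offers no way to change it. If importing numpy needs more stack than that, a crash only off the main thread would fit the symptoms. The sketch below is a diagnostic aid under that assumption, not a known fix. `run_with_stack` and `ThreadFn` are hypothetical names introduced here and are not part of pybind11; the helper uses only the POSIX pthread API to run a function on a thread with an explicit stack size.

```cpp
#include <pthread.h>

#include <cassert>
#include <cstddef>

// Hypothetical helper for this experiment; not part of pybind11.
using ThreadFn = void (*)();

namespace {
// pthread entry point: unpacks the function pointer and calls it.
void* trampoline(void* arg) {
    ThreadFn fn = *static_cast<ThreadFn*>(arg);
    fn();
    return nullptr;
}
}  // namespace

// Runs `fn` on a new thread with a stack of `stack_bytes` and waits for it
// to finish. Returns true if the thread was created and joined successfully.
bool run_with_stack(ThreadFn fn, std::size_t stack_bytes) {
    assert(fn != nullptr);

    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        return false;
    }

    pthread_t tid;
    bool started = pthread_attr_setstacksize(&attr, stack_bytes) == 0 &&
                   pthread_create(&tid, &attr, trampoline, &fn) == 0;
    pthread_attr_destroy(&attr);

    // `fn` lives in this frame, so join before returning.
    return started && pthread_join(tid, nullptr) == 0;
}
```

In the reproducer, `std::thread testThread(&test);` would become `run_with_stack(&test, 8u * 1024u * 1024u);`. Because test() never returns, that call blocks forever, which matches the behavior of the original main(). If the bus error disappears with the larger stack, stack size is the likely culprit; if it persists, this hypothesis can be dropped.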