pybind / pybind11

Seamless operability between C++11 and Python
https://pybind11.readthedocs.io/

Best way to get a view of a column of a numpy array? #2211

Open xin-jin opened 4 years ago

xin-jin commented 4 years ago

Suppose I have a function that operates on a 1d array in place:

void a(py::array_t<double> x) {
  // unchecked mutable accessor for the 1d array passed in as x
  auto a_p = x.mutable_unchecked<1>();
  // do something
}

I now have a 2d array and I would like to implement a function (say, one that uses multiple threads) that calls a on each column of the 2d array. What is the correct way to do this?

void b(py::array_t<double> x_2d, int num_threads) {
  boost::asio::thread_pool pool(num_threads);
  for (int i = 0; i < x_2d.shape(1); ++i) {
    boost::asio::post(pool, [&, i] {
      // get x as the ith column of x_2d, how to do it here?
      a(x);
    });
  }
  pool.join();
}

I tried something like:

x = py::array_t<double>(
  {x_2d.shape(0)}, {x_2d.strides(0)}, x_2d.data(0, i), py::str()
);

But I sometimes get a segmentation fault when num_threads is large, and I am not sure why.

YannickJadoul commented 4 years ago

Why do you need a view? Can't you loop over each column by index, in each thread? Or have this function take a plain pointer to the data instead of a Python object? (I don't know how py::array will interact with threads.) And what's the state of the GIL? Did you release it?
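A minimal sketch of the pointer-based approach suggested above, written in plain C++ so that the worker threads never touch Python objects. The names scale_column and scale_columns, the scaling operation itself, and the use of std::thread instead of boost::asio::thread_pool are all assumptions made for illustration; none of them come from the original post. Strides here are counted in elements, whereas pybind11 reports strides in bytes, so a real binding would presumably take the pointer from x_2d.mutable_data(), divide x_2d.strides(0) and x_2d.strides(1) by sizeof(double), and release the GIL with py::gil_scoped_release before starting the workers. That wiring is not shown and is untested here.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-column function `a`: it sees one column
// through a raw pointer, a length, and an element stride (no Python objects).
void scale_column(double* col, std::size_t n, std::size_t stride, double factor) {
    for (std::size_t r = 0; r < n; ++r) {
        col[r * stride] *= factor;  // step from one row to the next
    }
}

// Applies scale_column to every column of a rows x cols buffer.
// row_stride and col_stride are in elements, not bytes (an assumption of
// this sketch). Columns are dealt out round-robin to num_threads workers,
// so no two threads ever write to the same column.
void scale_columns(double* data, std::size_t rows, std::size_t cols,
                   std::size_t row_stride, std::size_t col_stride,
                   std::size_t num_threads, double factor) {
    assert(num_threads > 0);
    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([=] {
            // Worker t handles columns t, t + num_threads, t + 2 * num_threads, ...
            for (std::size_t c = t; c < cols; c += num_threads) {
                scale_column(data + c * col_stride, rows, row_stride, factor);
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // wait for all columns to finish before returning
    }
}
```

For a C-contiguous 2 x 3 array, a call would look like scale_columns(ptr, 2, 3, 3, 1, 4, 2.0). Because each worker only does pointer arithmetic on the shared buffer, no py::array_t is created or destroyed inside a thread, which sidesteps the question of whether constructing views without holding the GIL is safe.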

If things crash, have you managed to get a stack trace with gdb?