vectorizing functions with multiple return values?

mreineck commented 7 years ago

My goal is to provide a vectorized Python binding to C++ functions that return more than a single value, like

void ang2vec (double theta, double phi, double &x, double &y, double &z)
  {
  x=sin(theta)*cos(phi);
  y=sin(theta)*sin(phi);
  z=cos(theta);
  }

py::vectorize(ang2vec) does not work, producing error messages like "pybind11/numpy.h:732:5: error: forming pointer to reference type ‘double&’".

I tried to translate this into a more Python-like interface:

tuple<double,double,double> ang2vec2 (double theta, double phi)
  {
  return tuple<double,double,double>(sin(theta)*cos(phi),sin(theta)*sin(phi),cos(theta));
  }

but unfortunately vectorizing this fails as well.

Are there fundamental obstacles to supporting this scenario, or am I just the first one trying to do this? :)
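Until py::vectorize handles this, one workaround is to write the elementwise loop by hand. The following is only a sketch in plain C++ without pybind11: the names Vec3Arrays and ang2vec_many are hypothetical, and it assumes both inputs have the same length (no broadcasting).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar kernel from the question: results are returned through references.
void ang2vec(double theta, double phi, double &x, double &y, double &z) {
  x = std::sin(theta) * std::cos(phi);
  y = std::sin(theta) * std::sin(phi);
  z = std::cos(theta);
}

// Hypothetical container for the three output arrays (not a pybind11 type).
struct Vec3Arrays {
  std::vector<double> x, y, z;
};

// Hand-written elementwise loop: one call of the scalar kernel per input pair.
Vec3Arrays ang2vec_many(const std::vector<double> &theta,
                        const std::vector<double> &phi) {
  assert(theta.size() == phi.size());  // assumption: no broadcasting here
  const std::size_t n = theta.size();
  Vec3Arrays out{std::vector<double>(n), std::vector<double>(n),
                 std::vector<double>(n)};
  for (std::size_t i = 0; i < n; ++i)
    ang2vec(theta[i], phi[i], out.x[i], out.y[i], out.z[i]);
  return out;
}
```

In an actual binding, the std::vector buffers would presumably be replaced by py::array_t<double> and the three outputs returned as a tuple, but that part is not shown here.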

mgeier commented 6 years ago

I also would like to do this!

I think it would make most sense to return a py::array from the function to be vectorized.

I tried this with:

py::array_t<double> return_array(double t) {
  py::array_t<double> a({2});
  a.mutable_at(0) = a.mutable_at(1) = t;
  return a;
}

... and:

m.def("return_array", py::vectorize(&return_array));

... but this produced a very long compiler error, boiling down to "couldn't deduce template parameter ‘T’" and pointing to this code:

https://github.com/pybind/pybind11/blob/2d0507db43cd5a117f7843e053b17dffca114107/include/pybind11/numpy.h#L1497-L1502

So it looks like this isn't supported (but it would be great if it were!).

I checked whether this works in NumPy ...

>>> import numpy as np
>>> def myfunc(x):
...     return np.array([x, x])
... 
>>> v = np.vectorize(myfunc)
>>> v([2, 3])
Traceback (most recent call last):
...
ValueError: setting an array element with a sequence.

... and it turns out that it doesn't!

But then I had a look at the vectorize docs and found a fairly new argument called signature (added in version 1.12, released about a year ago, and introduced in https://github.com/numpy/numpy/pull/8054) that allows me to do what I want:

>>> v = np.vectorize(myfunc, signature='()->(n)')
>>> v([2, 3])
array([[2, 2],
       [3, 3]])

To implement this in pybind11, it wouldn't even be necessary to add a separate argument, because the compiler would know that the return type of the function is a py::array.

Sadly, I don't understand enough about the internals of pybind11 to tackle an implementation myself.

It seems that this is part of a grander scheme called Generalized Universal Functions. I think it would already be great to just support py::arrays in return values, but this could be extended to input arguments, too.
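To make the '()->(n)' behaviour concrete, here is a rough sketch in plain C++ of what such a wrapper would have to do: call the scalar function once per input and stack the fixed-size results as rows of an (N, n) block. This is not pybind11 code. Matrix, stack_results and return_pair are hypothetical names, and std::array stands in for py::array_t so that n is known at compile time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar-in, fixed-size-array-out kernel, mirroring return_array above.
std::array<double, 2> return_pair(double t) { return {t, t}; }

// Hypothetical result type: a row-major (rows x cols) block of doubles.
struct Matrix {
  std::size_t rows, cols;
  std::vector<double> data;
  double at(std::size_t i, std::size_t j) const {
    assert(i < rows && j < cols);
    return data[i * cols + j];
  }
};

// Apply f to every input and stack the results as rows: shape (N) -> (N, n).
template <std::size_t n, typename F>
Matrix stack_results(F f, const std::vector<double> &in) {
  Matrix out{in.size(), n, std::vector<double>(in.size() * n)};
  for (std::size_t i = 0; i < in.size(); ++i) {
    const std::array<double, n> row = f(in[i]);
    for (std::size_t j = 0; j < n; ++j) out.data[i * n + j] = row[j];
  }
  return out;
}
```

Calling stack_results<2>(return_pair, {2.0, 3.0}) yields the rows (2, 2) and (3, 3), matching the NumPy output above. With a real py::array return value the inner length would only be known at run time, so a wrapper would likely have to check that every call returns the same shape.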

This may or may not be related to this comment by @JohanMabille.

BTW, this is related to a more complicated problem I asked about on stackoverflow: https://stackoverflow.com/q/48736838/.

mgeier commented 6 years ago

I just realized that there is another way to handle multiple outputs in numpy.vectorize():

>>> import numpy as np
>>> def return_multiple_values(x):
...     return x, x + x, x * x
... 
>>> v = np.vectorize(return_multiple_values)
>>> v(1)
(array(1), array(2), array(1))
>>> v([2, 3])
(array([2, 3]), array([4, 6]), array([4, 9]))

This could probably also be implemented in pybind11, using std::tuple, but for my use case this wouldn't be applicable.
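As a rough illustration of the std::tuple idea, the sketch below maps a tuple-returning scalar function over a std::vector and produces one output array per tuple element. It is plain C++14 with no pybind11 involved, it only handles a single double argument, and the names scatter and vectorize_tuple are made up for this example.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Scalar kernel returning several values at once, as in the Python example.
std::tuple<double, double, double> return_multiple_values(double x) {
  return std::make_tuple(x, x + x, x * x);
}

// Copy element I of one scalar result into slot i of output array I.
template <typename... Ts, std::size_t... I>
void scatter(std::tuple<std::vector<Ts>...> &out, const std::tuple<Ts...> &r,
             std::size_t i, std::index_sequence<I...>) {
  // Pack expansion inside an array initializer (works without C++17 folds).
  int expand[] = {0, ((std::get<I>(out)[i] = std::get<I>(r)), 0)...};
  (void)expand;
}

// Map f over the input and return one output array per tuple element.
template <typename... Ts>
std::tuple<std::vector<Ts>...> vectorize_tuple(std::tuple<Ts...> (*f)(double),
                                               const std::vector<double> &in) {
  std::tuple<std::vector<Ts>...> out{std::vector<Ts>(in.size())...};
  for (std::size_t i = 0; i < in.size(); ++i)
    scatter(out, f(in[i]), i, std::index_sequence_for<Ts...>{});
  return out;
}
```

For the input {2.0, 3.0} this gives the three arrays (2, 3), (4, 6) and (4, 9), the same grouping that np.vectorize returns above.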
