Closed iamthebot closed 6 years ago
Looks like attempting to make an xt::pyarray<std::string>
yields the following error:
error: 'index' is not a member of 'pybind11::detail::is_fmt_numeric<std::__cxx11::basic_string<char>, void>'
static constexpr int type_num = value_list[pybind11::detail::is_fmt_numeric<value_type>::index];
Numpy arrays of strings are an interesting case. They store all the strings of the array in a single contiguous buffer, with each string padded to match the length of the longest one.
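A quick illustration of that storage layout (the array contents here are just an example):

```python
import numpy as np

# NumPy stores all strings of an array in one contiguous buffer;
# each element is padded to the width of the longest string.
a = np.array(["hi", "hello", "hey"])

# The dtype records the common width: 5 characters here.
assert a.dtype == np.dtype("<U5")

# Every element occupies the same number of bytes
# (4 bytes per UCS-4 code point for the 'U' dtype).
assert a.itemsize == 5 * 4
```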
Maybe we should tackle this using xtl's stack-allocated strings. Although, while numpy strings are null-terminated, I am not sure they leave space to store the size...
I think wrapping the buffer and mapping its contents to a std::string_view could be a good approach, and it has the correct semantics. std::string_view also has a constructor from a char* which finds the null terminator automatically.
I think the syntax to create these arrays would look a bit more like xt::pyarray<char[20]>, which, I believe, is also the syntax pybind11 supports. This would, however, imply that you need to know the maximum string length at compile time (and at runtime it is probably advisable to create the array with a concrete dtype, e.g. <U20 for a 20-character string).
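On the Python side, creating the array with a concrete fixed-width dtype as suggested above looks like this (the values are illustrative; note that assignments longer than the declared width are silently truncated by numpy):

```python
import numpy as np

# Create the array with an explicit fixed-width dtype up front:
# <U20 holds strings of at most 20 characters.
a = np.empty(3, dtype="<U20")
a[:] = ["foo", "bar", "a much longer string"]

assert a.dtype == np.dtype("<U20")
assert a[2] == "a much longer string"  # exactly 20 chars, fits

# Assigning a string longer than the declared width truncates it.
a[0] = "x" * 25
assert a[0] == "x" * 20
```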
Merged and released!
How would one go about taking in (and returning) a numpy array of strings using xtensor-python (assuming ASCII)?
The use case: I have a numpy array containing a batch of Base64-encoded JPEG images. I want to decode this batch in an OpenMP loop in C++. Ideally I should also be able to return a numpy array of strings.
I know I can work around this by creating a 2D numpy array of bytes (each row containing one ASCII string's bytes), but the problem is that this requires two passes, since we have to find the maximum string length first. Not to mention the string conversions on the Python side.
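For reference, the two-pass byte-matrix workaround described above could be sketched like this (the helper name is hypothetical, not part of any library):

```python
import numpy as np

def strings_to_byte_matrix(strings):
    """Pack ASCII strings into a 2D uint8 array (hypothetical helper).

    Pass 1 finds the maximum length; pass 2 copies each string's
    bytes into a row, padding shorter rows with zeros.
    """
    width = max(len(s) for s in strings)           # pass 1: max length
    out = np.zeros((len(strings), width), dtype=np.uint8)
    for i, s in enumerate(strings):                # pass 2: copy bytes
        out[i, :len(s)] = np.frombuffer(s.encode("ascii"), dtype=np.uint8)
    return out

mat = strings_to_byte_matrix(["ab", "abcd"])
assert mat.shape == (2, 4)
assert bytes(mat[1]) == b"abcd"
assert bytes(mat[0]) == b"ab\x00\x00"  # zero-padded to the max width
```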