I have a thin abstraction layer in my library which allows me to use different math backends (Xtensor, Eigen, Armadillo, etc.). This depends on being able to return views/maps from raw pointers. As per the documentation, the following function maps a C-style 1D array to a tensor:
template<typename T, std::size_t S>
inline auto Map(T* res) {
    auto a = xt::adapt(res, S, xt::no_ownership(), std::array{S});
    return a;
}
This is later used like this:
template<typename T, std::size_t S>
auto Add(T* res, auto const*... args) {
    Map<T, S>(res) = (Map<T const, S>(args) + ...);
}
However, the problem is that this code is ~2.5 times slower than Eigen, and my suspicion is that the data is actually being copied.
I've also investigated other causes, like a lack of optimizations, but:
- the code is compiled with -O3 -march=x86-64-v3 (which includes AVX2)
- xsimd is installed and XTENSOR_USE_XSIMD is defined
Any help fixing the performance issue would be greatly appreciated, thanks.
You can try removing the useless assignment in Map and returning xt::adapt(…) directly. That returns an xexpression, and it is easy for the compiler to apply RVO to it; NRVO sometimes can't be applied.
Hi,
I believe my issue is similar to https://github.com/xtensor-stack/xtensor/issues/600, but that issue is 7 years old and the solution no longer applies.