Closed niermann999 closed 5 months ago
This is still VERY WIP
Since this PR is already huge, the SoA layout for std::array
will be done in a separate PR
OK, I added a bit more of an explanation :)
Are the issues on macOS really insurmountable? I really don't like how that code is disabled right now.
Me neither, but I am not sure how to debug it. Basically, as soon as I call e.g. Vc::atan2
the release build on xcode throws Illegal
while running the tests. The debug test is fine though
P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)
Note that unrelated to the templating issues I also had to make these changes:
diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
index 4000dda..2ddd055 100644
--- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
+++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
@@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> {
const std::size_t n_warmup{this->m_cfg.n_warmup()};
// Spin down before benchmark (Thread zero is counting the clock)
- if (state.thread_index() == 0 and this->m_cfg.do_sleep()) {
+ if (state.thread_index() == 0 && this->m_cfg.do_sleep()) {
std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
}
@@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> {
const std::size_t n_warmup{this->m_cfg.n_warmup()};
// Spin down before benchmark (Thread zero is counting the clock)
- if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) {
+ if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) {
std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
}
MSVC just does not like the and
keyword. 😦
Note that unrelated to the templating issues I also had to make these changes:
diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp index 4000dda..2ddd055 100644 --- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp +++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp @@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> { const std::size_t n_warmup{this->m_cfg.n_warmup()}; // Spin down before benchmark (Thread zero is counting the clock) - if (state.thread_index() == 0 and this->m_cfg.do_sleep()) { + if (state.thread_index() == 0 && this->m_cfg.do_sleep()) { std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep())); } @@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> { const std::size_t n_warmup{this->m_cfg.n_warmup()}; // Spin down before benchmark (Thread zero is counting the clock) - if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) { + if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) { std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep())); }
MSVC just does not like the
and
keyword. 😦
Yes, I realized :) It is already gone from the PR #97 , but I have to put it here as well
P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)
Btw, what were those warnings about? I started seeing some in detray, too, although it had previously stopped after we rolled back to version 34.0 (and now started again)
This adds a new vector wrapper class that allows for an SoA layout, in particular, to enable us to use the
Vc::Vector
type efficiently. The newstorage/vector
keeps an array-like data structure, that holds the vector elements (say,x
,y
,z
). Both the array type, as well as the value type of the elements are templated (storage/vector< algebraic vector dim, aos/soa value type (scalar vs e.g SIMD vector), array-like storage for the algebraic vector elements >
), so that combinations like these are possible:std::array
based vector (not necessary, since this would essentially duplicate thearray
plugin, while begin more complicated):storage::vector<3, scalar_t, std::array>
vc_vc
plugin, while making the extravc_array4
wrapper superfluous):storage::vector<3, scalar_t, Vc::SimdArray>
std::array
based:storage::vector<3, std::array<scalar_t, N>, std::array>
storage::vector<3, Vc::Vector<scalar_t>, std::array>
Also adds benchmarks to compare to the
std::array
andEigen
AoS plugins. The benchmarks contain a base class that holds the basic configuration (e.g. number of warmup samples) and a derived vector benchmark class that holds the samples of random inititialized vectors and is templated on the kind of vector operation to be benchmarked (unary and binary are possible). The benchmarks for thegetter
namespace also make use of the vector benchmark class. The benchmarks for single and double precision are then instantiated per plugin, including a prescription on how to produce a random vector.