feat: SoA layout for vector

niermann999 commented 1 year ago

This adds a new vector wrapper class that allows for an SoA layout, in particular, to enable us to use the Vc::Vector type efficiently. The new storage/vector keeps an array-like data structure, that holds the vector elements (say, x, y, z). Both the array type, as well as the value type of the elements are templated (storage/vector< algebraic vector dim, aos/soa value type (scalar vs e.g SIMD vector), array-like storage for the algebraic vector elements >), so that combinations like these are possible:

3-dim AoS std::array based vector (not necessary, since this would essentially duplicate the array plugin, while begin more complicated): storage::vector<3, scalar_t, std::array>
3-dim AoS vertical vectorized (Not there, yet, but would re-implement the current vc_vc plugin, while making the extra vc_array4 wrapper superfluous): storage::vector<3, scalar_t, Vc::SimdArray>
3-dim SoA of size N, std::array based: storage::vector<3, std::array<scalar_t, N>, std::array>
3-dim vetorized SoA (in this PR): storage::vector<3, Vc::Vector<scalar_t>, std::array>

Also adds benchmarks to compare to the std::array and Eigen AoS plugins. The benchmarks contain a base class that holds the basic configuration (e.g. number of warmup samples) and a derived vector benchmark class that holds the samples of random inititialized vectors and is templated on the kind of vector operation to be benchmarked (unary and binary are possible). The benchmarks for the getter namespace also make use of the vector benchmark class. The benchmarks for single and double precision are then instantiated per plugin, including a prescription on how to produce a random vector.

niermann999 commented 1 year ago

~~This is still VERY WIP~~

niermann999 commented 1 year ago

Since this PR is already huge, the SoA layout for std::array will be done in a separate PR

niermann999 commented 1 year ago

OK, I added a bit more of an explanation :)

niermann999 commented 1 year ago

Are the issues on macOS really insurmountable? I really don't like how that code is disabled right now.

Me neither, but I am not sure how to debug it. Basically, as soon as I call e.g. Vc::atan2 the release build on xcode throws Illegal while running the tests. The debug test is fine though

krasznaa commented 11 months ago

P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)

krasznaa commented 11 months ago

Note that unrelated to the templating issues I also had to make these changes:

diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
index 4000dda..2ddd055 100644
--- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
+++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
@@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if (state.thread_index() == 0 and this->m_cfg.do_sleep()) {
+    if (state.thread_index() == 0 && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

@@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) {
+    if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

MSVC just does not like the and keyword. 😦

niermann999 commented 11 months ago

Note that unrelated to the templating issues I also had to make these changes:

diff --git a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
index 4000dda..2ddd055 100644
--- a/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
+++ b/benchmarks/common/include/benchmark/common/benchmark_vector.hpp
@@ -71,7 +71,7 @@ struct vector_unaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if (state.thread_index() == 0 and this->m_cfg.do_sleep()) {
+    if (state.thread_index() == 0 && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

@@ -113,7 +113,7 @@ struct vector_binaryOP_bm : public vector_bm<vector_t<scalar_t>> {
     const std::size_t n_warmup{this->m_cfg.n_warmup()};

     // Spin down before benchmark (Thread zero is counting the clock)
-    if ((state.thread_index() == 0) and this->m_cfg.do_sleep()) {
+    if ((state.thread_index() == 0) && this->m_cfg.do_sleep()) {
       std::this_thread::sleep_for(std::chrono::seconds(this->m_cfg.n_sleep()));
     }

MSVC just does not like the and keyword. 😦

Yes, I realized :) It is already gone from the PR #97 , but I have to put it here as well

niermann999 commented 11 months ago

P.S. I saw some "new" Eigen warnings from the CUDA build in my tests. Something to follow up on separately at one point. (Those may be only showing up on Windows, didn't test it rigorously.)

Btw, what were those warnings about? I started seeing some in detray, too, although it had previously stopped after we rolled back to version 34.0 (and now started again)

acts-project / algebra-plugins

feat: SoA layout for vector #95