BallisticLA / RandBLAS

A header-only C++ library for sketching in randomized linear algebra
https://randblas.readthedocs.io/en/stable/

WIP -- Rng types #25

Closed burlen closed 1 year ago

burlen commented 1 year ago

Some experiments with Random123. These have been included in #28. Note: in #28 I squashed the commits marked "squash" here.

burlen commented 1 year ago

rand123_gen_time

Compiled with gcc -O3 -march=native -mtune=native on an Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz. This is an older CPU.

The AESNI and ARS runs used the 4x32 types, not the 1xm128i type, because the latter does not work with r123::boxmuller or r123::uneg11. It is possible that ARS with the 1xm128i type would be the fastest.
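For readers unfamiliar with these conversions, here is a minimal, self-contained sketch of the kind of work r123::uneg11 and r123::boxmuller do: they consume individual 32-bit integer words, which is presumably why a single packed 128-bit value cannot be passed to them directly. The function names below (to_unit_interval, to_signed_unit_interval, box_muller) are hypothetical and this is not Random123's code; it only illustrates the technique.

```cpp
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical helper: map a 32-bit word to a double in (0, 1].
// The +1 keeps the result strictly positive, so log() below never sees 0.
inline double to_unit_interval(std::uint32_t x) {
    return (static_cast<double>(x) + 1.0) / 4294967296.0;  // divide by 2^32
}

// Hypothetical helper: map a 32-bit word to a double in (-1, 1].
inline double to_signed_unit_interval(std::uint32_t x) {
    return 2.0 * to_unit_interval(x) - 1.0;
}

// Box-Muller transform: two 32-bit words in, two standard normal samples out.
inline std::pair<double, double> box_muller(std::uint32_t u0, std::uint32_t u1) {
    const double pi = 3.14159265358979323846;
    const double r = std::sqrt(-2.0 * std::log(to_unit_interval(u1)));  // radius
    const double theta = pi * to_signed_unit_interval(u0);              // angle in (-pi, pi]
    return {r * std::cos(theta), r * std::sin(theta)};
}
```

With a 4x32 generator, each call yields four such words, so two Box-Muller evaluations (four normal samples) can be produced per call.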

burlen commented 1 year ago

philox_gen_unif_gen_norm_omp_scaling

The OpenMP parallel implementation has perfect strong scaling. Tested on a system with 10 physical cores.
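A rough sketch of why a counter-based generator parallelizes this cleanly: each output depends only on a key and the element's index, so threads share no state and the loop splits evenly. The mix function below is the SplitMix64 finalizer used as a stand-in for Philox to keep the example self-contained; it is an assumption for illustration, not the code benchmarked here, and fill_uniform is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a counter-based generator: a stateless mix of (key, counter).
// This is the SplitMix64 finalizer, NOT Philox.
inline std::uint64_t mix(std::uint64_t key, std::uint64_t ctr) {
    std::uint64_t z = key + 0x9E3779B97F4A7C15ULL * (ctr + 1);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Fill `out` with uniform doubles in [0, 1). Element i is a pure function of
// (key, i), so the result is identical for any number of threads.
inline void fill_uniform(std::vector<double>& out, std::uint64_t key) {
    const long long n = static_cast<long long>(out.size());
    // The pragma is ignored when compiled without OpenMP support.
    #pragma omp parallel for schedule(static)
    for (long long i = 0; i < n; ++i) {
        const std::uint64_t bits = mix(key, static_cast<std::uint64_t>(i));
        // Keep the top 53 bits and scale by 2^-53.
        out[static_cast<std::size_t>(i)] =
            static_cast<double>(bits >> 11) * (1.0 / 9007199254740992.0);
    }
}
```

Compile with -fopenmp to enable the parallel loop; the output does not change with the thread count, which makes the scaling easy to verify.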

burlen commented 1 year ago

phi_vector_sizes

32-bit types are faster than 64-bit types. This makes sense because of vectorization. The 2x types perform the same as the 4x types, possibly because inlining and loop unrolling result in exactly the same code for either.

burlen commented 1 year ago

This was merged in #28.