The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Update for devel OpenMP #3
Closed
mratsim closed 5 years ago
Following the merge of nim-lang/Nim#9493, the OpenMP annotation string will need to be patched when compiling for Nim 0.19.1/0.20 versus 0.19.
Additionally, we can handle both `forEach` and `reduceEach` in a single macro, as `nb_chunks: var int` wouldn't need to be passed anymore. `forEach` would only generate `#pragma omp for` instead of `#pragma omp parallel for` and rely on a previous `#pragma omp parallel`.
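To illustrate the pragma layout described above, here is a minimal C sketch of the kind of code such a macro could emit. It is not Laser's actual generated output: the function name `scale_and_sum` is hypothetical, and the `reduction` clause merely stands in for whatever `reduceEach` would produce. The point is only that a single enclosing `#pragma omp parallel` creates the thread team once, and each loop inside it uses a bare `#pragma omp for`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Laser): writes src[i] * factor into dst
   and returns the sum of dst. One parallel region hosts both loops. */
double scale_and_sum(const double *src, double *dst, size_t n, double factor)
{
    assert(n == 0 || (src != NULL && dst != NULL));

    double total = 0.0;
    long len = (long)n; /* signed index, accepted by all OpenMP versions */

    /* The thread team is created once here ... */
    #pragma omp parallel
    {
        /* ... so a "forEach"-style loop only needs `omp for`,
           not `omp parallel for`. */
        #pragma omp for
        for (long i = 0; i < len; ++i)
            dst[i] = src[i] * factor;
        /* The implicit barrier at the end of `omp for` guarantees dst is
           fully written before the next loop reads it. */

        /* A "reduceEach"-style loop reuses the same team. */
        #pragma omp for reduction(+:total)
        for (long i = 0; i < len; ++i)
            total += dst[i];
    }
    return total;
}
```

Built with an OpenMP flag such as `-fopenmp`, both loops share one thread team. Without the flag the pragmas are ignored and the function runs serially with the same result.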