Closed JamesYang007 closed 4 months ago
`utils.hpp`. The API should look like `xxx(..., n_threads)`.
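A minimal sketch of what an `xxx(..., n_threads)` style utility could look like. The function name `dot_product` and its signature are hypothetical, chosen only for illustration; the one idea it shows is taking the thread count as the last argument and skipping the OpenMP region entirely when `n_threads <= 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical utility following the xxx(..., n_threads) convention.
// Not taken from the repository; for illustration only.
double dot_product(
    const std::vector<double>& x,
    const std::vector<double>& y,
    int n_threads
)
{
    assert(x.size() == y.size());
    assert(n_threads >= 1);

    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());
    double sum = 0.0;

    if (n_threads <= 1) {
        // Serial path: no OpenMP construct is entered at all,
        // so no runtime overhead is paid for single-threaded calls.
        for (std::ptrdiff_t i = 0; i < n; ++i) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Parallel path: only reached when more than one thread is requested.
    // Without -fopenmp the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static) num_threads(n_threads) reduction(+:sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Callers would then write `dot_product(x, y, n_threads)` and choose the thread count per call rather than relying on a global setting.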
Found a critical OpenMP setting! `GOMP_SPINCOUNT` on GOMP, and `KMP_BLOCKTIME` on Homebrew's libomp on macOS, should be set high enough that threads don't go to sleep too soon. Waking sleeping threads is very costly in OpenMP. `OMP_PROC_BIND` also helps keep memory local, since each thread stays bound to a single CPU.
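These variables are typically read when the OpenMP runtime initializes, so they need to be in the environment before the program starts, for example `GOMP_SPINCOUNT=infinite KMP_BLOCKTIME=infinite OMP_PROC_BIND=true ./bench`. The binary name `./bench` and the values shown are illustrative assumptions, not tuned recommendations. The sketch below is a small hypothetical helper (`env_or_unset`, `omp_env_report`) that a benchmark could call to record which of these settings were actually in effect.

```cpp
#include <cstdlib>
#include <string>

// Returns the value of an environment variable, or "(unset)" if it is absent.
// Hypothetical helper, for illustration only.
std::string env_or_unset(const char* name)
{
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string("(unset)");
}

// Builds a "NAME=value" line for each OpenMP variable mentioned above,
// so benchmark logs record the configuration they were run under.
std::string omp_env_report()
{
    const char* names[] = {"GOMP_SPINCOUNT", "KMP_BLOCKTIME", "OMP_PROC_BIND"};
    std::string out;
    for (const char* name : names) {
        out += name;
        out += "=";
        out += env_or_unset(name);
        out += "\n";
    }
    return out;
}
```

Printing `omp_env_report()` at the top of each benchmark run makes it easier to tell afterwards whether a timing difference came from the code or from the runtime configuration.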
Most matrix utility functions are in `util.hpp` under `matrix/`. We just spotted a huge cost in `omp` even when `n_threads=1`! We should do some benchmarks to figure out how to perform the parallelism properly.
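One way such a benchmark could be set up is sketched below: the same loop written once as plain serial code and once inside an OpenMP region with `num_threads(n_threads)`, plus a small timer. All names here (`scale_serial`, `scale_omp`, `time_per_call`) are hypothetical and not taken from the repository; the sketch only shows the shape of the comparison and makes no claim about the resulting numbers.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Plain serial loop: the baseline with no OpenMP involvement.
void scale_serial(std::vector<double>& v, double a)
{
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        v[i] *= a;
    }
}

// Same loop inside an OpenMP parallel region. With n_threads == 1 the
// work is identical, so any extra time is region entry/exit overhead.
// Without -fopenmp the pragma is ignored and this equals the serial loop.
void scale_omp(std::vector<double>& v, double a, int n_threads)
{
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
    #pragma omp parallel for schedule(static) num_threads(n_threads)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        v[i] *= a;
    }
}

// Calls f() `reps` times and returns the average wall-clock seconds per call.
template <class F>
double time_per_call(F f, int reps)
{
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) {
        f();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / reps;
}
```

Comparing `time_per_call([&] { scale_serial(v, 1.0); }, reps)` against `time_per_call([&] { scale_omp(v, 1.0, 1); }, reps)` for small vectors, where the loop body is cheap, should expose the fixed cost of entering the parallel region.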