JamesYang007 / adelie

A fast and flexible Python package for solving group lasso and elastic net problems.
https://jamesyang007.github.io/adelie/
MIT License

Rigorous benchmark of matrix utilities for parallelism #79

Closed JamesYang007 closed 4 months ago

JamesYang007 commented 6 months ago

Most matrix utility functions are in util.hpp under matrix/. We just spotted a large OpenMP overhead even when n_threads=1! We should run some benchmarks to figure out how to parallelize these utilities properly.
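A minimal sketch of the kind of timing harness such a benchmark could start from. Everything here is an assumption for illustration: `bench` and `dummy_kernel` are hypothetical names, and `dummy_kernel` is a pure-Python stand-in, not one of the real utilities in util.hpp. The idea is only to time the same call at several thread counts and compare the `n_threads=1` case against the others.

```python
import timeit


def bench(fn, thread_counts, repeat=5, number=10):
    """Return {n_threads: best seconds per call} for fn(n_threads=...)."""
    results = {}
    for n in thread_counts:
        timer = timeit.Timer(lambda: fn(n_threads=n))
        # Take the minimum over repeats to reduce noise from other processes.
        results[n] = min(timer.repeat(repeat=repeat, number=number)) / number
    return results


def dummy_kernel(n_threads=1):
    # Hypothetical stand-in: a real benchmark would call a matrix utility
    # that forwards n_threads to its OpenMP-parallel implementation.
    return sum(range(1000))


timings = bench(dummy_kernel, [1, 2, 4])
for n_threads, seconds in sorted(timings.items()):
    print(f"n_threads={n_threads}: {seconds:.3e} s per call")
```

Taking the minimum over repeats is a common choice for micro-benchmarks, since slower runs mostly reflect interference rather than the code under test.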

JamesYang007 commented 4 months ago

Found a critical OpenMP setting! GOMP_SPINCOUNT (on GOMP) and KMP_BLOCKTIME (on Homebrew's libomp on macOS) should be set high enough that idle threads don't go to sleep too soon. Waking sleeping threads is very costly in OpenMP.
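A minimal sketch of setting these variables from Python, assuming the OpenMP runtime reads them when it initializes, so they need to be in the environment before the extension module that loads the runtime is imported. The helper name `configure_openmp_spin` is hypothetical, and the infinite values are only an illustrative upper extreme, not values recommended in this thread; spinning threads keep their cores busy while idle.

```python
import os


def configure_openmp_spin(env=os.environ, spin_count="INFINITE", block_time="infinite"):
    """Set spin-wait settings unless the user has already chosen values."""
    # libgomp (GCC): how long a thread spins before it goes to sleep.
    env.setdefault("GOMP_SPINCOUNT", spin_count)
    # libomp (LLVM/Intel): time a thread waits after a parallel region
    # before sleeping, in milliseconds by default.
    env.setdefault("KMP_BLOCKTIME", block_time)
    return {key: env[key] for key in ("GOMP_SPINCOUNT", "KMP_BLOCKTIME")}


settings = configure_openmp_spin()
print(settings)
# import adelie  # import only after the variables are set
```

Using `setdefault` means a value exported in the shell takes precedence over the defaults in the script.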

JamesYang007 commented 4 months ago

OMP_PROC_BIND also helps keep memory local, since each thread stays bound to a single CPU.
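A small sketch of enabling thread binding from Python, under the same assumption that the variable must be set before the OpenMP runtime starts. The helper `set_proc_bind` is a hypothetical name; the accepted values listed are the single-level policies from the OpenMP specification, and the comma-separated form for nested parallelism is deliberately not handled.

```python
import os

# Single-level policies defined by the OpenMP specification.
# "master" is the older spelling of "primary".
VALID_PROC_BIND = {"true", "false", "primary", "master", "close", "spread"}


def set_proc_bind(policy="true", env=os.environ):
    """Set OMP_PROC_BIND to a single binding policy and return it."""
    if policy not in VALID_PROC_BIND:
        raise ValueError(f"unsupported OMP_PROC_BIND value: {policy!r}")
    env["OMP_PROC_BIND"] = policy
    return env["OMP_PROC_BIND"]


set_proc_bind("true")
# import adelie  # import only after the variable is set
```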