Closed coreylowman closed 1 year ago
For now I'm using `rayon::current_num_threads()`.
Another question to tack on is related to batches of matrices. Should I use rayon to parallelize across the batch, or parallelize each individual gemm call and keep the items sequential?
`Rayon(0)` was added later and does the same thing as `Rayon(rayon::current_num_threads())`, so either is fine.
As for parallelism, I haven't experimented much with that, but my guess is that the higher the level of parallelism, the better. So if you can parallelize across the batch, that might give better results. Or you can try a mix of both, for example if your batch size isn't large enough to make efficient use of all the CPU cores.
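To make the "mix of both" idea concrete, here is a minimal sketch of one way to split a thread budget between batch-level workers and the threads given to each individual gemm call. The helper name `split_threads` and the splitting rule are assumptions for illustration only, not something gemm provides, and this hasn't been benchmarked.

```rust
/// Hypothetical helper (not part of gemm or rayon): split a thread budget
/// between batch-level workers (`outer`) and the number of threads handed
/// to each individual matrix multiply (`inner`).
///
/// Rule of thumb used here: prefer parallelism across the batch, and only
/// give leftover threads to the inner multiplies when the batch is too
/// small to occupy every core.
fn split_threads(batch_size: usize, total_threads: usize) -> (usize, usize) {
    // Treat a budget of 0 as a single thread so the division below is safe.
    let total = total_threads.max(1);
    // Never use more batch workers than there are batch items or threads.
    let outer = batch_size.clamp(1, total);
    // Whatever is left per worker goes to the individual multiply.
    let inner = (total / outer).max(1);
    (outer, inner)
}
```

With 8 threads, a batch of 64 would run as 8 sequential gemm calls in parallel, while a batch of 3 would run 3 items at once with 2 threads each. The `inner` value is what you would pass as `Rayon(inner)`, and the total would typically come from `rayon::current_num_threads()`.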
Thanks, seems like this will be determined at library level then. Will close this out!
The usages I'm seeing are:
- `rayon::current_num_threads()`, used in a couple of places in gemm
- `Rayon(0)`: does this mean something special, or is it rayon with 0 threads (I guess sequential)?

For a general purpose library that is not exposing gemm to external users, how should parallelism be configured?