sarah-quinones / gemm

MIT License

[Question] Suggested way to use Parallelism for libraries using gemm? #4

Closed coreylowman closed 1 year ago

coreylowman commented 1 year ago

The usages I'm seeing are:

For a general purpose library that is not exposing gemm to external users, how should parallelism be configured?

coreylowman commented 1 year ago

For now I'm using rayon::current_num_threads().

Another question to tack on is related to batches of matrices. Should I use rayon to parallelize across the batch, or parallelize each individual gemm call and keep the items sequential?

sarah-quinones commented 1 year ago

Rayon(0) was added later and does the same thing as Rayon(rayon::current_num_threads()), so either is fine. As for batches, I haven't experimented much with that, but my guess is that the higher the level at which you parallelize, the better, so parallelizing across the batch might give better results. You could also try a mix of both, for example if your batch size isn't large enough to make efficient use of all the CPU cores.
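To make the "mix of both" idea concrete, here is a minimal sketch of one way to divide a thread budget between batch-level workers and the threads used inside each multiplication. The helper name split_parallelism and its splitting rule are assumptions made for illustration, not part of gemm or rayon. It uses only the standard library so it runs on its own.

```rust
use std::thread;

/// Hypothetical helper (not part of gemm or rayon): split a thread budget
/// between batch-level workers and threads used inside each multiplication.
/// Returns (outer_workers, inner_threads_per_item).
fn split_parallelism(batch_size: usize, total_threads: usize) -> (usize, usize) {
    // Treat a budget of zero as a single thread.
    let total = total_threads.max(1);
    if batch_size == 0 {
        // Nothing to spread across; leave the whole budget to a single call.
        return (1, total);
    }
    // Prefer the higher level: one worker per batch item, capped by the budget.
    let outer = batch_size.min(total);
    // Whatever is left over is shared evenly inside each item's multiplication.
    let inner = (total / outer).max(1);
    (outer, inner)
}

fn main() {
    // Standard-library stand-in for rayon::current_num_threads().
    let total = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    for batch in [1, 4, 64] {
        let (outer, inner) = split_parallelism(batch, total);
        println!("batch={batch}: {outer} batch workers x {inner} threads per gemm");
    }
}
```

In a real library the inner count would presumably be passed to each call as Parallelism::Rayon(inner), or Parallelism::None when inner is 1, while the batch items are distributed over the outer workers. Whether this split beats pure batch-level parallelism would need benchmarking on the target matrix sizes.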

coreylowman commented 1 year ago

Thanks, seems like this will be determined at library level then. Will close this out!