Closed jknowles closed 7 years ago
Thanks for the comments.
It's been on my mind, I've just never got round to it. So a PR would be welcome.
If you have an idea of what such a benchmark would look like, I'm all ears. I'm thinking that it would detect the number of cores, run some of the existing programming benchmarks on powers of two. E.g. if you had 8 cores, it would run the benchmark on 1, 2, 4, 8 cores.
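A rough sketch of the core sequence described above, in case it helps the discussion. The helper name `core_sequence` is made up for illustration and is not part of the package; only `parallel::detectCores()` is a real call.

```r
# Hypothetical helper: powers of two up to the number of detected cores.
core_sequence <- function(max_cores = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(max_cores) || max_cores < 1) max_cores <- 1L
  cores <- 1L
  out <- integer(0)
  while (cores <= max_cores) {
    out <- c(out, cores)
    cores <- cores * 2L
  }
  out
}

core_sequence(8)  # 1 2 4 8
core_sequence(6)  # 1 2 4 (non-power-of-two machines stop at the largest power below)
```

Whether a 6-core machine should also be benchmarked on all 6 cores is a design choice this sketch leaves open.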
Hi @csgillespie
I am going to try to start this on the multicore branch on my fork.
Are you opinionated about how to do the parallel version of the testing? I see two approaches:
1. foreach
2. RcppParallel functions that do the same matrix calculations, but in parallel
Option 1 is much easier to implement and is where I am starting. If there is a good reason to prefer option 2, I'm willing to dive into doing that, but it will take much longer for me to implement some of the benchmarks that way...
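For reference, a minimal sketch of what the foreach route could look like, assuming the `foreach` and `doParallel` packages are installed. The function name `run_foreach_benchmark` and the matrix task inside the loop are placeholders for illustration, not the package's actual benchmarks.

```r
library(foreach)

# Hypothetical runner: repeat a small matrix task `runs` times on `cores` workers.
run_foreach_benchmark <- function(cores, runs = 4, n = 200) {
  cl <- parallel::makeCluster(cores)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  doParallel::registerDoParallel(cl)

  elapsed <- system.time(
    res <- foreach(i = seq_len(runs), .combine = c) %dopar% {
      # Placeholder workload standing in for a real matrix benchmark.
      m <- matrix(stats::rnorm(n * n), nrow = n)
      sum(crossprod(m))
    }
  )[["elapsed"]]

  list(cores = cores, elapsed = elapsed, n_results = length(res))
}

out <- run_foreach_benchmark(cores = 2)
out$elapsed
```

The elapsed time here includes the cost of shipping work to the workers, which is one reason the timings are not directly comparable to a plain serial run.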
Since we're interested in how standard R code scales across cores, using existing benchmarks would be fine. A few thoughts:
foreach is cross platform?
I will double check that foreach is cross-platform, but I dev on Windows, and Windows is usually the laggard, right? It works great on Windows.
One difference is that Linux/Mac support process forking, which is more efficient and is not available on Windows. I'm mostly concerned personally with accurately benchmarking Windows performance, but I do not think it would be too hard to extend the work to allow forking on Mac/Linux platforms.
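One way the platform difference could be handled, sketched with the base `parallel` package only. The helper name `cluster_type` is hypothetical; `makeCluster()` with `type = "PSOCK"` or `type = "FORK"` is the standard interface, and forked clusters are only available on Unix-alikes.

```r
library(parallel)

# Hypothetical helper: forked workers where available, socket workers on Windows.
cluster_type <- function() {
  if (.Platform$OS.type == "windows") "PSOCK" else "FORK"
}

cl <- makeCluster(2, type = cluster_type())
res <- parLapply(cl, 1:4, function(i) i^2)
stopCluster(cl)

unlist(res)  # 1 4 9 16
```

Forcing `"PSOCK"` everywhere would keep results comparable across operating systems, at the cost of not showing what forking buys on Linux/Mac.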
One thing it will do is add a bunch of dependencies. Should we make them "Suggests", or go for it and make them full dependencies?
I've used it on Linux, so it sounds like it will be fine.
Regarding dependencies, go for standard imports. Once we have a skeleton, we can re-evaluate the situation.
Cool. Do you want me to send you a PR when I have a minimal multicore test in place, so you can check it out? I started with the matrix calculation benchmark because it was the one I was most interested in.
Yes please.
Quick question @csgillespie
Would you prefer that users can compare benchmark_std() directly to the MC benchmark, or would you prefer an MC benchmark that tells users performance for the same workflow at cores through a sequence of powers of 2?
The latter seems easier because the foreach overhead in the way I am doing MC probably adds something that is not picked up in benchmark_std().
I think the latter is more sensible. We don't really care about comparing serial with parallel using a single core.
We would run the benchmark in parallel with one core but only to normalise the results.
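The normalisation against the one-core parallel run could look roughly like this. The function name `normalise_timings` is an assumption for illustration, and the timings in the example call are invented purely to show the arithmetic; they are not measured results.

```r
# Hypothetical helper: express each timing relative to the one-core parallel run.
normalise_timings <- function(cores, elapsed) {
  stopifnot(length(cores) == length(elapsed), 1 %in% cores)
  baseline <- elapsed[cores == 1]
  speedup <- baseline / elapsed
  data.frame(
    cores = cores,
    elapsed = elapsed,
    speedup = speedup,
    efficiency = speedup / cores  # 1 means perfect scaling
  )
}

# Made-up timings in seconds, for illustration only.
scaled <- normalise_timings(cores = c(1, 2, 4), elapsed = c(8, 4, 4))
scaled
```

Because the baseline is itself a parallel run on one worker, the foreach overhead is present in every row and largely cancels out of the ratio.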
Hi,
Great package -- I am so glad someone has done this.
For some workflows it might be nice to have a sense of multicore/multithread performance.
Any plans to add that?
If it could be done it would be a great addition to the package. If not, might you accept a pull request if I can find time to do it?