RFC: Add pierreglaser/sklearn_parallel_benchmark to scikit-learn_benchmarks

pierreglaser commented 5 years ago

Hi,

This year I created a repository called sklearn_parallel_benchmarks. It benchmarks the performance of many scikit-learn estimators when fitted in parallel, either using threads or processes.
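To illustrate the kind of measurement involved, here is a minimal sketch, not code taken from sklearn_parallel_benchmarks itself. It assumes joblib's `parallel_backend` context manager is used to switch between threads (`"threading"`) and processes (`"loky"`). The helper names `best_of`, `benchmark_fit` and `demo`, the dataset size, and the choice of `RandomForestClassifier` are all hypothetical.

```python
import time


def best_of(fn, repeat=3):
    """Return the smallest wall-clock duration (in seconds) over `repeat` calls of fn."""
    durations = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return min(durations)


def benchmark_fit(make_estimator, X, y, backends=("threading", "loky"), n_jobs=2):
    """Time a fresh estimator's fit under each joblib backend.

    Returns a dict mapping backend name to the best observed duration in seconds.
    """
    # Imported lazily; joblib is a dependency of scikit-learn.
    from joblib import parallel_backend

    results = {}
    for backend in backends:
        def fit_once():
            # The context manager selects the backend used by joblib-based
            # estimators that do not hard-require a specific one.
            with parallel_backend(backend, n_jobs=n_jobs):
                make_estimator().fit(X, y)

        results[backend] = best_of(fit_once)
    return results


def demo():
    """Hypothetical usage: compare threads and processes on a random forest."""
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    timings = benchmark_fit(
        lambda: RandomForestClassifier(n_estimators=50, random_state=0), X, y
    )
    for backend, seconds in timings.items():
        print(f"{backend}: {seconds:.3f} s")
```

Calling `demo()` prints one timing per backend. The numbers depend on the machine and on the installed versions, so none are shown here.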

This benchmark suite made it possible to spot several regressions, which led to fixes in scikit-learn master. Examples:

scikit-learn/scikit-learn#13389
scikit-learn/scikit-learn#13310

On the other hand, it is possible that @jeremiedbb's benchmark suite (this repository) becomes the official benchmark suite of scikit-learn, and eventually gets transferred to the scikit-learn organisation.

I therefore suggest adding my benchmark suite to this repository. As both benchmark suites are still WIP, and there is still some cleaning up to do on both sides, we could first register my benchmark suite as a submodule of scikit-learn_benchmarks, so that each suite can still be updated quickly and independently. Once the two suites converge, we can discuss other vendoring options.

Here is a vague TODO list to begin with:

[x] add pierreglaser/sklearn_parallel_benchmarks as a submodule of jeremiedbb/scikit-learn_benchmarks in an experimental branch
[ ] clean up both repositories
[ ] add jeremiedbb/scikit-learn_benchmarks to the scikit-learn org

rth commented 4 years ago

Yes, this would be a very good idea.

> add jeremiedbb/scikit-learn_benchmarks to the scikit-learn org

The scikit-learn issue about this is https://github.com/scikit-learn/scikit-learn/issues/16723

jjerphan commented 4 years ago

I guess this PR can be closed thanks to https://github.com/scikit-learn/scikit-learn/pull/17026?

RFC: Add pierreglaser/sklearn_parallel_benchmark to scikit-learn_benchmarks #16