rgommers opened this issue 3 years ago
> Do you have pointers on what those benchmarks should look like? Is there a preferred set of problems to test code on? See, for example, https://julialang.org/benchmarks/
The idea of that Julia page is about right, I think: short, with only a plot and no code. The main things I'd change from that:
Off the top of my head I'm not sure about an existing widely used set of benchmarks to adopt.
Thanks @rgommers for opening this issue!
A few remarks:
I guess it's better to keep things reasonably simple on this page so that people can get a quick overview of what can be done. I wouldn't consider too many problems.
I think it is better to consider full problems from existing benchmark games (for example http://initialconditions.org/ or https://benchmarksgame-team.pages.debian.net/benchmarksgame/, code here) rather than only tiny micro-benchmarks (like those in https://julialang.org/benchmarks/), so as to see the code in quasi-real-life situations (meaning not just a few functions defined in a Jupyter notebook). It would also be interesting to mention aspects other than elapsed time, for example readability, file size, technical difficulties, coding time, and maintainability. Optimizing is always a balance.
One advantage of Python is that it's possible to go step by step from very simple implementations (sometimes not very efficient) to more complex and more efficient ones. It would be nice to be able to show that. The N-Body problem is a good example; see the sketch below.
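For instance (a rough sketch, not code from any existing benchmark; the function names and sizes are mine), the first two steps could be a plain-Python double loop and a NumPy broadcasting version of the same pairwise-acceleration kernel:

```python
import numpy as np

def accelerations_loops(positions, masses):
    """Naive O(n^2) double loop -- slow but easy to read."""
    n = len(masses)
    accs = np.zeros_like(positions)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = positions[j] - positions[i]
            accs[i] += masses[j] * d / np.linalg.norm(d) ** 3
    return accs

def accelerations_numpy(positions, masses):
    """Same computation with NumPy broadcasting -- no Python loops."""
    d = positions[None, :, :] - positions[:, None, :]  # (n, n, 3) pairwise vectors
    r3 = np.sum(d**2, axis=-1) ** 1.5                  # |d|^3, shape (n, n)
    np.fill_diagonal(r3, 1.0)                          # avoid 0/0 on the diagonal (numerator is zero there)
    return np.sum(masses[None, :, None] * d / r3[:, :, None], axis=1)

positions = np.random.rand(64, 3)
masses = np.random.rand(64)
assert np.allclose(accelerations_loops(positions, masses),
                   accelerations_numpy(positions, masses))
```

The page could then show the same kernel accelerated further with Numba, Pythran, etc.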
I don't think it is necessary to compare Transonic and Pythran. By default Transonic uses Pythran, so both tools end up with the same performance. Transonic just makes Pythran easier to use in real-life code (outside of Jupyter notebooks), with a Python API similar to Numba's and based on Python type annotations. Transonic can also use Numba and Cython as backends, but that's another story and I don't think it is necessary to go into such detail on this page.
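To make that concrete, something like this could illustrate the workflow (a minimal sketch using Transonic's `boost` decorator and string type annotations as shown in its README; the exact ahead-of-time compilation command should be checked against the current docs):

```python
import numpy as np
from transonic import boost

@boost
def laplace(image: "float64[:, :]"):
    """Discrete Laplacian of a 2D array.

    Runs as plain NumPy until an extension has been built
    ahead of time with the `transonic` command (Pythran being
    the default backend); afterwards the compiled version is
    picked up transparently, with no change to the call site.
    """
    return (image[:-2, 1:-1] + image[2:, 1:-1]
            + image[1:-1, :-2] + image[1:-1, 2:]
            - 4 * image[1:-1, 1:-1])
```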
The N-Body problem can be a good example
It's also interesting to give at least one example using OpenMP.
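That example could be as small as this sketch, adapted from the Monte Carlo example in Pythran's documentation (Pythran reads `# omp` comments and emits the corresponding OpenMP directives; the `-fopenmp` flag below is what its docs suggest passing at compile time):

```python
# pythran export pi_estimate(int)
import random

def pi_estimate(darts):
    """Monte Carlo estimate of pi, parallelized via an OpenMP pragma.

    Compile with `pythran -fopenmp thisfile.py`; the code also runs
    unchanged (serially) as plain Python.
    """
    hits = 0
    # omp parallel for reduction(+:hits)
    for _ in range(darts):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / darts
```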
This article https://onlinelibrary.wiley.com/iucr/doi/10.1107/S1600576719008471 is very interesting and serious. It should be cited.
It would be good to send two important messages in terms of performance: (1) no premature optimization, and (2) measure, don't guess. We can at least mention cProfile.
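The mention could be as short as a standard-library snippet like this one (a sketch; `work` is just a placeholder for the code being optimized):

```python
import cProfile
import pstats

def work():
    # Placeholder for the code being optimized.
    return sum(i * i for i in range(10**6))

cProfile.run("work()", "profile.out")  # measure, don't guess
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)  # top 10 call sites
```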
It would be good to also honestly present some limitations of this strategy of acceleration of Python codes.
> It would also be interesting to mention aspects other than elapsed time, for example readability, file size, technical difficulties, coding time, and maintainability. Optimizing is always a balance.
That's a good point, yes.
> It's also interesting to give at least one example using OpenMP.
I don't think I'd want to get into that, on the same page at least. Because then we'd also have to touch on other forms of parallelism (e.g. Dask, multiprocessing, asyncio).
> This article https://onlinelibrary.wiley.com/iucr/doi/10.1107/S1600576719008471 is very interesting and serious. It should be cited.
Thanks, I wasn't aware of this article. It's really well-written.
> It would be good to send two important messages in terms of performance: (1) no premature optimization, and (2) measure, don't guess. We can at least mention cProfile.
I think the page really should focus on performance, rather than turning into a tutorial. So this can be one line to one paragraph, but it should link elsewhere for things like profiling.
Adding links to the recent Nature correspondence by @paugier et al.:
Related to https://github.com/numpy/numpy.org/issues/308#issuecomment-634612765 (connect content to "key features" on front page).
Adding content on benchmarks and accelerators (i.e. Cython, Numba, Pythran, Transonic) was also just suggested in https://mail.python.org/pipermail/numpy-discussion/2020-November/081248.html