ipython / ipython

Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
https://ipython.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Allow Choice of Summary Statistic(s) in %timeit #10712

Open mfenner1 opened 7 years ago

mfenner1 commented 7 years ago

See also PR https://github.com/ipython/ipython/pull/9984

Within the last year, ipython (and hence, Jupyter) switched to reporting mean +/- standard deviation. This is a different strategy than is recommended in the core Python docs (see the Note under https://docs.python.org/3.6/library/timeit.html#timeit.Timer.repeat) and by core Python developers (e.g., Raymond Hettinger on SO: https://stackoverflow.com/a/8220943). I'm merely noting the difference in strategies; I'm intentionally not making a claim as to when and where one of these strategies would be preferred.
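To make the difference concrete, here is a minimal sketch using only the standard library. The timed statement and the repeat/loop counts are arbitrary placeholders chosen for illustration, not anything IPython uses internally.

```python
import statistics
import timeit

LOOPS = 1000  # loops per repeat (arbitrary choice for this sketch)

# Each entry of `totals` is the total time for LOOPS executions.
totals = timeit.repeat("sum(range(100))", repeat=5, number=LOOPS)
per_loop = [t / LOOPS for t in totals]

best = min(per_loop)                 # what the timeit docs suggest looking at
mean = statistics.mean(per_loop)     # what %timeit now reports ...
stdev = statistics.stdev(per_loop)   # ... together with the spread

print(f"best: {best:.3e} s per loop")
print(f"mean +/- std. dev.: {mean:.3e} +/- {stdev:.3e} s per loop")
```

On a loaded machine the mean and standard deviation tend to be pulled upward by slow runs, while the minimum is not.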

It would be great to see %timeit and %%timeit support a few different options for summary statistics. Several simple cases come to mind:

  1. Mean +/- StdDev (current behavior and used by perf referenced in https://github.com/ipython/ipython/pull/9984)
  2. Median +/- MAD (median absolute deviation) - a "statistically robust" alternative to 1
  3. Best (Python docs recommendation)
  4. Worst
  5. None. No summary, report all of the numbers (perhaps sort them first).

Now, a user could implement each of these by accessing the returned TimeitResult (https://github.com/ipython/ipython/blob/f2d2913c977e412befcf1a82da2f82b6ad3d8a27/IPython/core/magics/execution.py#L56). However, a switch on the cell magic would make this trivial with an API like:

%timeit -s mean
%timeit -s median
%timeit -s best
%timeit -s worst
%timeit -s none
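For comparison, here is a rough sketch of what a user has to write by hand today. The `summarize` helper and its `stat` names are hypothetical (they only mirror the proposed `-s` values and are not part of IPython), and MAD is taken here to be the median absolute deviation, which is an assumption. The function works on a plain list of per-loop times in seconds; in IPython, such a list should be available as the `timings` attribute of the object returned by `%timeit -o`.

```python
import statistics


def summarize(timings, stat="mean"):
    """Summarize per-loop timings (seconds). Hypothetical helper, not IPython API."""
    if stat == "mean":
        # (center, spread); stdev needs at least two timings
        return statistics.mean(timings), statistics.stdev(timings)
    if stat == "median":
        med = statistics.median(timings)
        # MAD assumed to mean the median absolute deviation from the median
        mad = statistics.median([abs(t - med) for t in timings])
        return med, mad
    if stat == "best":
        return min(timings)
    if stat == "worst":
        return max(timings)
    if stat == "none":
        # no summary: hand back every measurement, sorted
        return sorted(timings)
    raise ValueError(f"unknown summary statistic: {stat!r}")


# Made-up timings with one slow outlier, as on a loaded machine.
sample = [0.4, 0.1, 0.3, 0.2, 1.0]
for name in ("mean", "median", "best", "worst", "none"):
    print(name, summarize(sample, name))
```

With the outlier in `sample`, the mean-based summary shifts noticeably while the median and best values do not, which is the kind of difference a `-s` switch would let the user choose between.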

I think this would be a nice feature and it would allow ipython and jupyter devs to be mostly agnostic about "the right" way to measure performance (up to a possibly controversial default summary method). If I were to vote, I'd prefer to give no summary. Make the user think about what that distribution of times means. If the user wants one, the user would have to specify one. Or, perhaps as long as repeats is less than 10 or so, have no summary.

mforbes commented 1 year ago

I second this: access to best is important when running timeit on loaded computers running other programs. Some sort of characterization of the variation would still be useful.