psf / pyperf

Toolkit to run Python benchmarks
http://pyperf.readthedocs.io/
MIT License

Incorrect standard error #203

Closed fzakaria closed 1 month ago

fzakaria commented 1 month ago

https://github.com/psf/pyperf/blob/bf02a8b7cc45a47897359681c844b4c507afc552/pyperf/_cli.py#L290

The docs link to https://en.wikipedia.org/wiki/Standard_error, which defines the standard error as standard deviation / square-root(n).

pyperf seems to be only reporting the standard deviation?

Not sure how to use this information along with the t-score to get my confidence intervals.
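For what it's worth, the relationship the Wikipedia page describes can be sketched in plain Python. This is not pyperf code; the timing values below are made up, and the t critical value is looked up for 9 degrees of freedom:

```python
import math
import statistics

# Hypothetical benchmark timings in seconds (10 runs); not real pyperf output.
values = [0.101, 0.103, 0.099, 0.105, 0.100, 0.102, 0.098, 0.104, 0.101, 0.100]

n = len(values)
mean = statistics.mean(values)
stdev = statistics.stdev(values)   # sample standard deviation
sem = stdev / math.sqrt(n)         # standard error of the mean

# Two-tailed 95% critical value for Student's t with n-1 = 9 degrees of
# freedom (from a t-table); scipy.stats.t.ppf(0.975, 9) gives the same.
t_crit = 2.262
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem

print(f"mean={mean:.4f} stdev={stdev:.4f} sem={sem:.4f}")
print(f"95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
```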

vstinner commented 1 month ago

Did you try the https://pyperf.readthedocs.io/en/latest/cli.html#stats-cmd command?

The https://pyperf.readthedocs.io/en/latest/cli.html#compare-to-cmd command computes also the t-test:

pyperf determines whether two samples differ significantly using a Student’s two-sample, two-tailed t-test with alpha equals to 0.95.
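A pooled (equal-variance) two-sample t statistic of the kind that quote describes can be computed by hand; this is only an illustrative sketch with made-up samples, not pyperf's actual implementation:

```python
import math
import statistics

def two_sample_t(sample1, sample2):
    """Pooled two-sample Student's t statistic (equal-variance form)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Hypothetical timings from two benchmark runs.
a = [0.100, 0.102, 0.101, 0.103, 0.099]
b = [0.110, 0.112, 0.111, 0.113, 0.109]
t = two_sample_t(a, b)
# Compare |t| against the two-tailed 95% critical value for
# n1 + n2 - 2 = 8 degrees of freedom (about 2.306).
print(f"t = {t:.2f}, significant = {abs(t) > 2.306}")
```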

fzakaria commented 1 month ago

I tried those commands; they do show the t-test value. I opened this issue because the documentation claims that the value shown is the "Standard Error" (it even links to the wiki page for Standard Error), but it is in fact the sample stdev.

[screenshot of pyperf output]

Showing it with +- felt a bit incorrect to me. It probably needs to be combined with the t-score to get a 95% confidence interval? Anyway, if I'm wrong, that also makes sense, since I'm poor at statistics. I was using pyperformance to benchmark some stuff and wanted to flag something I found confusing.

vstinner commented 1 month ago

Oh, stdev() is the standard deviation: https://docs.python.org/dev/library/statistics.html#statistics.stdev
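To make the distinction concrete, here is a minimal example (not from pyperf) showing that `statistics.stdev()` returns the sample standard deviation, which must still be divided by sqrt(n) to get the standard error:

```python
import math
import statistics

values = [1.0, 2.0, 3.0, 4.0, 5.0]
sd = statistics.stdev(values)      # sample standard deviation: sqrt(2.5) ≈ 1.581
se = sd / math.sqrt(len(values))   # standard error of the mean: ≈ 0.707
print(sd, se)
```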

fzakaria commented 1 month ago

Right, and my reading was that this value is being reported as the Standard Error in the API.

[screenshot of the pyperf API documentation]

vstinner commented 1 month ago

I think that there is a mistake in the link: I wrote https://github.com/psf/pyperf/pull/204 to fix the doc.