psf / pyperf

Toolkit to run Python benchmarks
http://pyperf.readthedocs.io/
MIT License

Include pystats during warmups #170

Closed by mdboom 11 months ago

mdboom commented 11 months ago

Recent investigations have made it clear that the pystats support in pyperf (which I originally submitted myself) isn't ideal, because stats aren't collected during warmups. Warmups run in the same process as the "main" runs, so any optimizations or specializations performed during the warmup aren't recorded in the stats.

This change simply includes warmups in stats collection. Calibration runs are left as is, since they are run in a separate process.
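
As a rough illustration of the difference, here is a minimal sketch of a worker loop that either does or does not cover warmups with stats collection. It is not pyperf's actual code: `run_worker` and its parameters are hypothetical names. The only real interfaces assumed are `sys._stats_on()` and `sys._stats_off()`, which exist only on CPython builds configured with `--enable-pystats`; on a regular build the sketch falls back to no-ops.

```python
import sys
import time


def _noop():
    """Stand-in used when the interpreter was not built with pystats."""


def run_worker(func, warmups, values, include_warmups=True,
               stats_on=None, stats_off=None):
    """Hypothetical worker loop; returns the timings of the measured runs.

    stats_on / stats_off default to sys._stats_on / sys._stats_off when
    available (pystats builds only), otherwise to no-ops.
    """
    stats_on = stats_on or getattr(sys, "_stats_on", _noop)
    stats_off = stats_off or getattr(sys, "_stats_off", _noop)

    if include_warmups:
        # Behaviour described in this change: warmups are counted too,
        # so specialization work done while warming up shows up in the stats.
        stats_on()

    for _ in range(warmups):
        func()  # warmup: executed, but its timing is discarded

    if not include_warmups:
        # Previous behaviour: stats only start after the warmups.
        stats_on()

    timings = []
    for _ in range(values):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)

    stats_off()
    return timings
```

Either way the timings cover only the measured runs; the flag changes only which calls are visible to the stats counters.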

markshannon commented 11 months ago

It seems odd to have the stats different from the times. I approve of removing "warmup", but I'd like the stats and times to be for the same thing.

vstinner commented 11 months ago

pyperf stores the timing of each "run", but if I understand correctly, stats are only accumulated across all runs, maybe even across all worker processes, no?

mdboom commented 11 months ago

> I approve of removing "warmup", but I'd like the stats and times to be for the same thing.

That was the goal when pystats support was originally added to pyperf, but it hides a lot of the specialization/optimization work. There's a discussion of what happens when warmups are included in the timings at https://github.com/faster-cpython/bench_runner/issues/61. I'm not sure it's worth it.

mdboom commented 11 months ago

> pyperf stores the timing of each "run", but if I understand correctly, stats are only accumulated across all runs, maybe even across all worker processes, no?

If you mean that we don't collect stats for each run separately, but instead collect them for the whole run of the process (excluding the pyperf / pyperformance "harness"), you are correct. It means we can't see what each run did, or whether earlier runs behave differently from later ones. It probably could be done, but it would add a lot of complexity and generate a lot more data.
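
To make the trade-off concrete, here is a rough sketch of what per-run collection might look like. Nothing like this exists in pyperf as far as this thread says: `run_with_per_run_stats` is a hypothetical helper. It assumes the `sys._stats_clear()`, `sys._stats_on()`, `sys._stats_off()` and `sys._stats_dump()` hooks of a `--enable-pystats` CPython build, and falls back to no-ops elsewhere.

```python
import sys


def _noop():
    """Stand-in used when the interpreter was not built with pystats."""


def run_with_per_run_stats(func, runs, clear=None, on=None, off=None, dump=None):
    """Hypothetical per-run collection: reset, collect and dump for every run.

    Returns the number of dumps produced, which equals the number of runs,
    instead of the single dump a whole-process collection produces.
    """
    clear = clear or getattr(sys, "_stats_clear", _noop)
    on = on or getattr(sys, "_stats_on", _noop)
    off = off or getattr(sys, "_stats_off", _noop)
    dump = dump or getattr(sys, "_stats_dump", _noop)

    dumps = 0
    for _ in range(runs):
        clear()  # drop counters left over from the previous run
        on()
        func()
        off()
        dump()   # one dump per run, so output grows linearly with runs
        dumps += 1
    return dumps
```

Even in this simplified form, each run produces its own dump that has to be stored and matched back to the run it came from, which is where the extra data and complexity come in.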