microsoft / knossos-ksc

Compiler with automatic differentiation

Loosen benchmark time check #913

Closed cgravill closed 3 years ago

cgravill commented 3 years ago

Only abort if a single round exceeds max_time. Previously it enforced rounds * duration < max_time, but that was getting in the way; see #907

This check prevents incredibly slow benchmarks from blocking nightly runs while hopefully getting in the way less.

Note this may want further revision later; what we really want is per-benchmark time limiting, which isn't configurable in pytest-benchmark as-is.

awf commented 3 years ago

Roughly this: 5 seconds should be enough to get a reasonable time estimate, even for one round. However, there may be other factors, so we would like at least 10 rounds. Call it 12, giving an upper bound of a minute.

Then ignore the "at least 12 rounds" -- 1 round will be fine for something that took 55 seconds, but the interquartile range and variances should be displayed as Inf.

So IMO the logic should be, to benchmark f:

    max_time_per_round = 5  # seconds
    with timeout(max_time_per_round * 12):
        try:
            t = time(f())
        except TimeoutError:
            # A single run took longer than the whole budget
            abort()
        except Exception as e:
            # Some other error
            raise

        nrounds = ceil(max_time_per_round * 12 / t)
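The pseudocode above can be sketched as runnable Python. This is only an illustration of the logic, not code from the PR: `plan_rounds` is a made-up name, and the `timeout` is approximated by a thread-pool future with a deadline (which stops waiting but does not actually kill the worker thread).

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def plan_rounds(f, max_time_per_round=5.0, min_rounds=12):
    """Time one call of f, then plan how many rounds fit in the budget.

    Returns None if a single call overruns the whole budget (the abort
    case above). The future timeout is a portable stand-in for a real
    timeout: it gives up waiting but cannot kill the worker thread.
    """
    budget = max_time_per_round * min_rounds
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        future = pool.submit(f)
        try:
            future.result(timeout=budget)
        except FutureTimeout:
            return None  # a single run took longer than the budget
        t = time.perf_counter() - start
    # At least one round; otherwise as many rounds of duration t as fit.
    return max(1, math.ceil(budget / t))
```

With the PR's defaults (5 s per round, 12 rounds) the budget is one minute, matching awf's upper bound.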

awf commented 3 years ago

Of course, this now means all benchmarks take a minute, and we may want a further constraint saying that any benchmark for which t < 0.1 sec should not take more than e.g. 10 sec?
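That extra constraint could be bolted onto the round-count calculation along these lines. The thresholds and the name `capped_rounds` are illustrative only:

```python
import math


def capped_rounds(t, max_time_per_round=5.0, min_rounds=12,
                  fast_threshold=0.1, fast_budget=10.0):
    """Round count from the one-minute budget above, additionally capped
    so that fast benchmarks (t < fast_threshold) stay under fast_budget."""
    budget = max_time_per_round * min_rounds
    nrounds = max(1, math.ceil(budget / t))
    if t < fast_threshold:
        # Don't spend the full budget on something we already measure well.
        nrounds = min(nrounds, max(1, math.floor(fast_budget / t)))
    return nrounds
```

So a 1-second benchmark still gets the full 60 rounds, while a 10 ms one is capped at roughly 10 seconds of total run time rather than a minute.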

toelli-msft commented 3 years ago

Thanks, this is helpful. Can I suggest a change to the phrasing? I found the meaning of "Duration of even a single benchmark" hard to grasp initially. How about

The duration({duration}) of a single benchmark round would be longer than max_time({benchmark._max_time}), aborting.

cgravill commented 3 years ago

Hmm, that doesn't read smoothly to me either. I wanted to communicate that we've run the given function once and it took longer than the maximum time for the whole set of benchmarks. Does this read any clearer to you?

The duration({duration}) of a single run of the benchmark function was longer than max_time({benchmark._max_time}), aborting.

cgravill commented 3 years ago

@awf I've added your pseudocode to AB#19865

Limiting the time of a single function call might require either signals (I haven't tried) or multiprocessing, which would be a bit involved. The strict check I put in is getting in @toelli-msft's way when benchmarking larger data examples, so I suggest first easing off that check and then making further improvements.

Incidentally we can use benchmark._min_rounds instead of embedding the 12.

toelli-msft commented 3 years ago

Does this read any clearer to you?

Yes, much better, thanks!