Return timing info for simulated annealing

kevinchern commented 1 year ago

PR for #22 starting with simulated annealing.

kevinchern commented 1 year ago

Need to document timing info (in doc string)

codecov[bot] commented 1 year ago

Codecov Report

Merging #54 (cf8269d) into main (3760f49) will increase coverage by 0.06%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #54      +/-   ##
==========================================
+ Coverage   92.52%   92.59%   +0.06%     
==========================================
  Files          17       17              
  Lines         535      540       +5     
==========================================
+ Hits          495      500       +5     
  Misses         40       40

Impacted Files	Coverage Δ
dwave/samplers/sa/sampler.py	`83.45% <100.00%> (+0.64%)`	:arrow_up:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

kevinchern commented 1 year ago

I am using process_time to time each step. Do we have a case for using perf_counter instead?

arcondello commented 1 year ago

I am using process_time to time each step. Do we have a case for using perf_counter instead?

I have no strong feelings. I have usually used perf_counter, but defer to the benchmarking team as to which is preferable. FWIW, I don't think accounting for the process sleeping is very important.

randomir commented 1 year ago

If we're trying to capture a wallclock time here (and I hope we do), then process_time is off the table. The only viable candidates, IMO, are time.monotonic and time.perf_counter (or even better, their _ns counterparts). And since perf_counter is already monotonic, it's the best choice here.

kevinchern commented 1 year ago

Should we include "units" a key in the timing dict and make other keys consistently named without units? i.e.,

dict(timing=dict(
    units="nanoseconds",
    preprocessing=...,
    sampling=...,
    postprocessing=...,
))

instead of

dict(timing=dict(
    preprocessing_ns=...,
    sampling_ns=...,
    postprocessing_ns=...,
))

arcondello commented 1 year ago

I prefer the latter over the former. And by using _ns we're consistent with the time library which seems like a reasonable standard to follow if we're in-fact using the _ns methods.

kevinchern commented 1 year ago

OK this should be ready for another review.

Summary:

Timings standardized using a Stopwatch class
Timing information is returned as a nested dict in the sample set's info: dict(timing=dict(preprocessing_ns=..., ))

arcondello commented 1 year ago

Or, even better leave the phase names out of it completely, and just return periods in a list.

My understanding of the intention was to have the names/timer-type/units defined in a single place so that the samplers all follow the same pattern. So having the names encoded is part of the design.

I would leave the names in, but make the stopwatch easier to use by handling cases like switching over timings. It may will be that there is a sampler that has a pattern like

for _ in range(num_samples):
    stopwatch.start_preprocessing()

    # do preprocessing here

    stopwatch.start_sampling()

    # do sampling here

    stopwatch.start_postprocessing()

    # do postprocessing here

stopwatch.stop()

in which case having things like accumulation handled is a nice feature. This also abstracts things like the specifics of which timer, the format of the dict etc from the developer which may be desirable.

But my favorite option would be to keep this simple and stupid, at least until we start using it, and develop a better sense for requirements.

I agree that stupid is also good at an early phase. If the intention of this PR is to only implement timing for SA, I would just do it the dumb way there. If this PR is intended to cover all of the samplers then having a more abstract class makes sense to me.

arcondello commented 1 year ago

But my favorite option would be to keep this simple and stupid, at least until we start using it, and develop a better sense for requirements.

I agree that stupid is also good at an early phase. If the intention of this PR is to only implement timing for SA, I would just do it the dumb way there. If this PR is intended to cover all of the samplers then having a more abstract class makes sense to me.

To follow up on this, given the current name of the PR "Return timing info for simulated annealing" I do think that it may make sense to move the StopWatch stuff out into a separate PR. Maybe after we've implemented timing for 2+ samplers and have a better feel for the edge cases.

For your current needs, do you just need SA timing? Or other samplers as well? And if so, which ones?

kevinchern commented 1 year ago

I need timing info for neal and greedy. Maybe random in the near future.

Let me revert the changes to use the unabstracted implementation. I'll put in a PR for the stopwatch when there's a need for it.

kevinchern commented 1 year ago

Summary:

reverted to the simple implementation and added tests.
The stopwatch approach will be in a separate PR if/when needed

arcondello commented 1 year ago

One last one last thing, needs a release note

kevinchern commented 1 year ago

Added in https://github.com/dwavesystems/dwave-samplers/pull/54/commits/8db81ee9535c5b45837eef5f48e92529f0f9502a

dwavesystems / dwave-samplers

Return timing info for simulated annealing #54

Codecov Report