dwavesystems / dwave-system

An API for easily incorporating the D-Wave system as a sampler, either directly or through Leap's cloud-based hybrid samplers
https://docs.ocean.dwavesys.com/
Apache License 2.0
90 stars 65 forks source link

DWaveSampler returns done() prematurely #297

Open spakin opened 4 years ago

spakin commented 4 years ago

Description Jobs submitted asynchronously to a DWaveSampler à la the example in the from_future documentation may return done() == True before they're actually done.

To Reproduce Consider the following code:

#! /usr/bin/env python

import dimod
from concurrent.futures import ThreadPoolExecutor
from dwave.system import DWaveSampler, EmbeddingComposite
import time

# Ensure some jobs.
njobs = 5
#sampler = EmbeddingComposite(DWaveSampler())
sampler = DWaveSampler()
bqm = dimod.BinaryQuadraticModel.from_ising({}, {(0, 4): -1})
executor = ThreadPoolExecutor()
futures = []
samplesets = []
for i in range(njobs):
    futures.append(executor.submit(sampler.sample, bqm, num_reads=1000, annealing_time=2000))
    samplesets.append(dimod.SampleSet.from_future(futures[i]))
executor.shutdown(wait=False)

# Report when they're finished.
print("Enqueued %d job(s)" % njobs)
start_time = time.time()
while not all([ss.done() for ss in samplesets]):
    pass
print("Done is claimed after %d second(s)" % (time.time() - start_time))
start_time = time.time()
info = []
for i in range(njobs):
    info.append(samplesets[i].info)
print("Have info after %d more second(s)" % (time.time() - start_time))

Run it with sampler = DWaveSampler() as written. Then comment out that line, uncomment the sampler = EmbeddingComposite(DWaveSampler()) line, and run it again.

Expected behavior With sampler = EmbeddingComposite(DWaveSampler()), most of the time is attributed to waiting for all the jobs to finish, as expected:

./premature.py
Enqueued 5 job(s)
Done is claimed after 14 second(s)
Have info after 0 more second(s)

However, with sampler = DWaveSampler(), most of the time is attributed to accessing info, which implies to me that the future is returning done() == True before the job is actually done, and tickling info is forcing the wait for completion:

./premature.py
Enqueued 5 job(s)
Done is claimed after 0 second(s)
Have info after 13 more second(s)

Environment:

arcondello commented 4 years ago

If you change ss.done() to ss._future.done() does the behaviour stay the same? Or do you get an AttributeError?

spakin commented 4 years ago

The behavior stays the same.

arcondello commented 4 years ago

Cool bug! Without an obvious fix.

When a sample set is constructed using from_future, it saves that future as an attribute on itself. So if you change your first while loop to

while not all([ss.done() for ss in samplesets]):
    pass
else:
    print(samplesets[0].done(), type(samplesets[0]._future), samplesets[0]._future.done())

you'll see that the future created by executor.submit(...) is claiming to be done, which SampleSet is then propagating to you.

The reason that the ThreadPoolExecutor's future is claiming to be done is because it is executing the .sample method, which is successfully sampling, then not blocking. So the executor is seeing that job as "done".

arcondello commented 4 years ago

One thing you could try is make a new function

def submit_and_block(*args, **kwargs):
   ss = sampler.sample(*args, **kwargs)
   ss.resolve()
   return ss

then use that

futures.append(executor.submit(submit_and_block, bqm, ...))
spakin commented 4 years ago

I just plugged that into my "real" code, and it works! Thanks for the workaround.