dwavesystems / dwave-cloud-client

A minimal implementation of the REST interface used to communicate with D-Wave Solver API (SAPI) servers.
https://docs.ocean.dwavesys.com/projects/cloud-client/en/stable/
Apache License 2.0
59 stars 40 forks source link

Optimize `StructuredSolver.check_problem()` #487

Closed randomir closed 2 years ago

randomir commented 2 years ago

Currently problem graph check is naively implemented and relatively slow.

import dimod
from dwave.cloud.solver import StructuredSolver
from dwave.cloud.testing import mocks

C16 = StructuredSolver(data=mocks.qpu_chimera_solver_data(16), client=None)
P16 = StructuredSolver(data=mocks.qpu_pegasus_solver_data(16), client=None)

bqm_c = dimod.generators.ran_r(1, (C16.nodes, C16.edges))
bqm_p = dimod.generators.ran_r(1, (P16.nodes, P16.edges))
In [1]: %timeit C16.check_problem(bqm_c.linear, bqm_c.quadratic)
5.04 ms ± 89.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [2]: %timeit P16.check_problem(bqm_p.linear, bqm_p.quadratic)
36 ms ± 871 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
randomir commented 2 years ago

OTOH, large chunk of time is taken by BQM traversal/conversion. With:

h = dict(bqm_p.linear)
J = dict(bqm_p.quadratic)

We get:

In [1]: %timeit P16.check_problem(h, J)
13.5 ms ± 119 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Comparing that to a fast superset check:

In [2]: %timeit P16.nodes.issuperset(h) and P16.edges.issuperset(J)
8.97 ms ± 95.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

looks like there's not much we can gain here.

arcondello commented 2 years ago

To add a tiny bit more context. First, and this was probably obvious, it's the quadratic that's dominating the time

In [9]: %timeit dict(bqm.linear)
2.87 ms ± 20.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [10]: %timeit dict(bqm.quadratic)
38 ms ± 242 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Second, about 50% of the overhead is coming from the abstractions of the quadratic view.

In [11]: %timeit dict(((u, v), bias) for u, v, bias in bqm.iter_quadratic())
20.1 ms ± 84.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

I can take a look at the quadratic view to see what can be done to speed that up a bit.

Though worth noting that the check_problem() function ends up using .items() https://github.com/dwavesystems/dwave-cloud-client/blob/ba57577dc4b01de28d61841a4070a8f9628e7b8b/dwave/cloud/utils.py#L122 and that's not so bad

In [13]: %timeit dict(bqm.quadratic.items())
19.5 ms ± 172 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
randomir commented 2 years ago

Useful observation about the quadratic view overhead, I haven't checked that.

Ultimately, shouldn't matter here, as we actually only need set(keys), but good to know.

In [1]: %timeit dict(bqm_p.quadratic)
40.4 ms ± 408 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [2]: %timeit set(bqm_p.quadratic)
21.5 ms ± 540 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit set(bqm_p.quadratic.keys())
22.2 ms ± 452 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

On that note, going forward, to enable performance improvements, we'll need to make sure native/low-level bqm can be used (and is used) end-to-end. Right now we do too many bqm conversions along the way.