dwavesystems / dwave-samplers

Classical algorithms for solving binary quadratic models
Apache License 2.0

seed parameter is not working as expected in SimulatedAnnealingSampler #71

Closed SadiaAfrinPurba closed 1 month ago

SadiaAfrinPurba commented 1 month ago

Hi,

I am using the D-Wave simulator to generate samples from a BQM (Binary Quadratic Model). Later, these samples are used for some calculations. Now, I want to ensure that the results are reproducible every time.

Here is the code I am using:

sampler = SimulatedAnnealingSampler()
sampleset = sampler.sample(bqm, chain_strength=chain_strength, num_reads=num_reads, embedding_parameters=dict(timeout=10), seed=random_seed)

solution1 = sampleset.first.sample
solution1_list_final = [v for k, v in solution1.items()]

However, despite using the same seed value, I get different samples in different runs. Below are the outputs:

Run-01

Random seed: 0
[opposite layer] solution1_list_final: [1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0]
Random seed: 0
[opposite layer] solution1_list_final: [1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1]

Run-02

Random seed: 0
[opposite layer] solution1_list_final: [0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
Random seed: 0
[opposite layer] solution1_list_final: [0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0]

My expectation is that for the same seed value (e.g., seed = 0), I should get the same values in solution1_list_final across different runs.

arcondello commented 1 month ago

Hi @SadiaAfrinPurba , this repo is deprecated in favour of https://github.com/dwavesystems/dwave-samplers, so I am going to transfer this issue over there.

arcondello commented 1 month ago

Are you using EmbeddingComposite by chance? Are you able to provide your full script and/or the BQM?

SadiaAfrinPurba commented 1 month ago

Hi @arcondello,

No, I am not using EmbeddingComposite. The QPU parameter is set to False, and I am using the local simulator. The code is intended for sampling data points for the Quantum Restricted Boltzmann Machine (QRBM) algorithm.

def sample_opposite_layer_pyqubo(v, layer, weights, opposite_layer,
                                 qpu=False, chain_strength=2, num_reads=1, random_seed=0):
    # print(f"Sampling opposite layer by pyqubo")
    # initialize Hamiltonian
    H = 0
    H_vars = []

    # initialize all variables (one for each node in the opposite layer)
    for j in range(len(opposite_layer)):
        H_vars.append(Binary(str(j)))

    for i, bias in enumerate(layer):
        # filter only chosen nodes in the first layer
        if not v[i]:
            continue

        # add reward to every connection
        for j, opp_bias in enumerate(opposite_layer):
            H += -1 * weights[i][j] * H_vars[j]

    for j, opp_bias in enumerate(opposite_layer):
        H += -1 * opp_bias * H_vars[j]

    model = H.compile()
    bqm = model.to_bqm()

    if qpu: sampler = EmbeddingComposite(DWaveSampler(token=token))
    else:   sampler = SimulatedAnnealingSampler()

    sampleset = sampler.sample(bqm, chain_strength=chain_strength, num_reads=num_reads, embedding_parameters=dict(timeout=10), seed=random_seed)
    print(f'Random seed: {random_seed}')
    solution1 = sampleset.first.sample
    solution1_list_final = [v for k, v in solution1.items()]

    print(f"[opposite layer] solution1_list_final: {solution1_list_final}")

    return solution1_list_final

arcondello commented 1 month ago

I am missing a few things from your snippet to be able to re-run it myself. Could you add a print(bqm) and send the output?

In the meantime, I am trying to put together a minimal reproducible example, but so far the seed parameter seems to be working correctly. My example is

import itertools

import dimod
import numpy as np

from dwave.samplers import SimulatedAnnealingSampler

rng = np.random.default_rng(42)

num_variables = 10

h = {v: 0 for v in range(10)}
J = {(u, v): rng.choice((-1, +1)) for u, v in itertools.combinations(range(num_variables), 2)}

bqm = dimod.BQM.from_ising(h, J)
sampler = SimulatedAnnealingSampler()
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)

which for me outputs

{0: 1, 1: -1, 2: -1, 3: -1, 4: 1, 5: 1, 6: -1, 7: 1, 8: -1, 9: 1}
{0: 1, 1: -1, 2: -1, 3: -1, 4: 1, 5: 1, 6: -1, 7: 1, 8: -1, 9: 1}
{0: 1, 1: -1, 2: -1, 3: -1, 4: 1, 5: 1, 6: -1, 7: 1, 8: -1, 9: 1}

as expected. Changing the three sample calls to

print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=43, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=44, num_reads=10).first.sample)

gives different solutions, also as expected.
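
As a stricter check than comparing only .first, the full arrays of returned samples can be compared as well (a small illustrative addition to the snippet above; ss1 and ss2 are made-up names):

ss1 = sampler.sample(bqm, seed=42, num_reads=10)
ss2 = sampler.sample(bqm, seed=42, num_reads=10)

# The record holds all num_reads samples; this prints True when the two runs match exactly.
print((ss1.record.sample == ss2.record.sample).all())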

To confirm, does the above snippet produce repeatable samples for you?

SadiaAfrinPurba commented 1 month ago

Here is the output of print(bqm)


BinaryQuadraticModel({'28': 2.4857529860418666, '9': 0.2779236514569752, '1': -1.4442167607129388, '22': 0.4804691140762303, '21': 0.3390883245862948, '23': 0.6709427065670543, '20': -0.9047421864407115, '24': 1.0621558823537696, '19': -1.8632675541258215, '25': 0.3218134108250361, '2': 0.6947383069651447, '0': -1.0871754013963173, '17': 0.27339634140019053, '27': 0.184146240251158, '7': 1.0785875127069786, '10': 0.36130092792970503, '8': -1.9924667400558835, '16': 0.9987778496222772, '30': 0.5527559203155485, '6': -1.3186347201356727, '3': 2.8472759704472246, '4': -1.595327557577595, '26': -0.5323122709344206, '14': -2.2507832804778847, '5': 1.8540833197825206, '29': 0.5469496673338472, '15': 2.2105932253480542, '13': 0.21111135134337977, '18': 2.4185717674147424, '11': 0.3181408305029323, '31': 3.159497356659715, '12': -0.1411743780361494}, {}, 0.0, 'BINARY')

I am also providing the code for training the QRBM model. The original data contains 32 features.

def train(self, training_data, len_x=1, len_y=1, epochs=50, lr=0.1, lr_decay=0.1, epoch_drop = None, momentum = 0, batch_size = None, exclude_label = False):

        momentum_w = np.zeros((len(self.visible_bias), len(self.hidden_bias)))
        momentum_v = np.zeros(len(self.visible_bias))
        momentum_h = np.zeros(len(self.hidden_bias))

        for epoch in range(epochs):
            random_selected_training_data_idx = random.randint(0,len(training_data)-1)
            v = training_data[random_selected_training_data_idx]

            v = v[:-1] if exclude_label else v

            old_v = v

            h = samp.sample_opposite_layer_pyqubo(old_v, self.visible_bias,
                                                  self.w, self.hidden_bias,
                                                  qpu=self.qpu,
                                                  chain_strength=self.cs,
                                                  random_seed=self.random_seed)

            pos_grad = np.outer(v, h)

            v_prim = samp.sample_opposite_layer_pyqubo(h, self.hidden_bias,
                                                       self.w.T,
                                                       self.visible_bias,
                                                       qpu=self.qpu,
                                                       chain_strength=self.cs,
                                                       random_seed=self.random_seed)

            h_prim = samp.sample_opposite_layer_pyqubo(v_prim,
                                                       self.visible_bias,
                                                       self.w, self.hidden_bias,
                                                       qpu=self.qpu,
                                                       chain_strength=self.cs,
                                                       random_seed=self.random_seed)

            momentum_w = (momentum * momentum_w) + (lr * (pos_grad - neg_grad))

            self.w += momentum_w

            self.visible_bias += momentum_v
            self.hidden_bias += momentum_h
            sample_v = v
            sample_h = samp.sample_opposite_layer_pyqubo(sample_v,
                                                         self.visible_bias,
                                                         self.w,
                                                         self.hidden_bias,
                                                         qpu=self.qpu,
                                                         chain_strength=self.cs,
                                                         random_seed=self.random_seed)

        sample_output = samp.sample_opposite_layer_pyqubo(sample_h,
                                                           self.hidden_bias,
                                                           self.w.T,
                                                           self.visible_bias,
                                                           qpu=self.qpu,
                                                           chain_strength=self.cs,
                                                           random_seed=self.random_seed)

arcondello commented 1 month ago

Ok, even with that BQM I am seeing the expected behavior

import itertools

import dimod
import numpy as np

from dwave.samplers import SimulatedAnnealingSampler

bqm = dimod.BinaryQuadraticModel({'28': 2.4857529860418666, '9': 0.2779236514569752, '1': -1.4442167607129388, '22': 0.4804691140762303, '21': 0.3390883245862948, '23': 0.6709427065670543, '20': -0.9047421864407115, '24': 1.0621558823537696, '19': -1.8632675541258215, '25': 0.3218134108250361, '2': 0.6947383069651447, '0': -1.0871754013963173, '17': 0.27339634140019053, '27': 0.184146240251158, '7': 1.0785875127069786, '10': 0.36130092792970503, '8': -1.9924667400558835, '16': 0.9987778496222772, '30': 0.5527559203155485, '6': -1.3186347201356727, '3': 2.8472759704472246, '4': -1.595327557577595, '26': -0.5323122709344206, '14': -2.2507832804778847, '5': 1.8540833197825206, '29': 0.5469496673338472, '15': 2.2105932253480542, '13': 0.21111135134337977, '18': 2.4185717674147424, '11': 0.3181408305029323, '31': 3.159497356659715, '12': -0.1411743780361494}, {}, 0.0, 'BINARY')

sampler = SimulatedAnnealingSampler()
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)
print(sampler.sample(bqm, seed=42, num_reads=10).first.sample)

My guess is that bqm = model.to_bqm() is not deterministic, resulting in different BQMs each time.
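
A quick within-process check of that guess (an illustrative sketch, not from the original comment; the toy expression and variable names are made up) is to compile the same pyqubo expression twice and compare the resulting BQMs, since dimod BQMs compare by their terms rather than by variable order:

from pyqubo import Binary

# Toy expression standing in for the QRBM Hamiltonian (illustrative only).
H = -1.5 * Binary("0") + 2.0 * Binary("1") + 0.5 * Binary("0") * Binary("1")

bqm_a = H.compile().to_bqm()
bqm_b = H.compile().to_bqm()

# Prints True only if the conversion is deterministic for a fixed expression.
print(bqm_a == bqm_b)

Differences across separate runs, as in the outputs in the next comment, still have to be checked by comparing the printed BQMs from each run.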

SadiaAfrinPurba commented 1 month ago

Your guess is correct. bqm = model.to_bqm() is non-deterministic, resulting in different BQMs each time. I was using the pyqubo library (https://docs.ocean.dwavesys.com/en/stable/docs_pyqubo.html) to convert the model into a BQM.

Here are the outputs I got using the same seed value:

Run-01

BinaryQuadraticModel({'28': 1.7481486857802937, '9': 0.636251768053611, '1': -2.7482570002453945, '22': 0.4246857417990957, '21': 2.7945886933272863, '23': -1.1872871812713368, '20': -1.010386472903088, '24': 3.051227743776772, '19': -0.01727684402393903, '25': -0.8741565335780186, '2': -0.8558570748091949, '0': -2.2561749249449283, '17': 1.1940271172702066, '27': -3.391096426805616, '7': 2.6370504536815327, '10': 2.7200882037309357, '8': -3.5529834258717394, '16': 0.7918442030294448, '30': 0.3347234126483176, '6': -0.12286650486423278, '3': 0.6015592846634932, '4': -1.8437034255562867, '26': -0.19412297761926123, '14': -1.7425616928258132, '5': 2.634446290485771, '29': -1.6228444425682358, '15': 0.9743367125742353, '13': 0.09157298824756244, '18': -2.883257418932871, '11': -2.0777647286815397, '31': 2.024691743795172, '12': 4.125327178551468}, {}, 0.0, 'BINARY')

Run-02

BinaryQuadraticModel({'28': 3.0213915634296775, '9': 1.947368806124834, '1': -2.477912839590351, '22': 1.1584522322154105, '21': 4.0418241080051605, '23': 2.1707034736805952, '20': -2.1059585137348416, '24': 3.7461861184436223, '19': -1.225606689977668, '25': -1.226385986145118, '2': -2.9317314429967922, '0': 0.3935515238092242, '17': -1.5634668536784813, '27': -3.0452873996119387, '7': 0.5912404015995589, '10': 2.4399740856058134, '8': -5.601639463150018, '16': 0.7602999852087589, '30': -2.00832942337417, '6': 3.554348215944165, '3': 7.623898934503879, '4': -3.826748193753091, '26': -0.6831572779201278, '14': -0.5670401529196474, '5': 3.022155983780988, '29': -1.5806221070835984, '15': 1.0627132418548515, '13': 1.6239872780183493, '18': 1.2269074418474692, '11': -2.111655116733181, '31': 0.3334842085008862, '12': 5.395503729957219}, {}, 0.0, 'BINARY')

Thank you for your insight. You can close the issue, as it is not directly related to SimulatedAnnealingSampler().
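
For reference, a minimal sketch of how the overall pipeline could still be made reproducible, assuming the run-to-run variation in the compiled BQM comes from the random.randint call in train() and from NumPy-based weight initialization (the latter is an assumption, since that initialization is not shown): seed those generators once at the start of training, in addition to passing seed to the sampler.

import random

import numpy as np

# Assumption: covers the random.randint(...) used to pick training samples.
random.seed(0)

# Assumption: covers any NumPy-based initialization/updates of self.w and the biases.
np.random.seed(0)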

arcondello commented 1 month ago

Great! Thanks for the bug report!