oscarhiggott / PyMatching

PyMatching: A Python/C++ library for decoding quantum error correcting codes with minimum-weight perfect matching.
Apache License 2.0

Feed in a large amount of experiment data at one time? #16

Closed · inmzhang closed this issue 3 years ago

inmzhang commented 3 years ago

For example, I get more than 500,000 stabilizer results after running a single QEC experiment or simulation repeatedly, and I want to decode this large amount of data efficiently. In my implementation, I feed one set of experiment data into a fixed matching object at a time, which takes a relatively long time to decode all 500,000 results. I wonder whether there is a more efficient way to do this.

m = pymatching.Matching(graph)
results = []
for exp in exps:
    correction = m.decode(exp)
    results.append(correction)
inmzhang commented 3 years ago

I used a somewhat clumsy method to deal with this: I divide the experiments into groups of n = 100, giving 500,000/100 groups, then create a new matching object for each group and decode group by group. But as n grows, the speedup from this method levels off and then gets worse, presumably because creating a matching object with a large n is time-consuming.

oscarhiggott commented 3 years ago

Hi @inmzhang123, without seeing the rest of your code or timing/profiling information I'm not sure exactly what the bottleneck/problem is, so if it's easy to post a complete working example that demonstrates the problem (or timing data), that would help.

How large are your matching graphs, and how large is each correction array? Could it be that you are running out of RAM if each instance is large? If so, you could try removing the results.append(correction) line and seeing if that speeds things up - you may not need to store all the results anyway if all you need is a tally of successes and failures.
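
A minimal sketch of that tallying suggestion. The check matrix, random syndromes, and the expected array below are stand-ins for illustration, not data from this issue:

import numpy as np
import pymatching

# Stand-ins for the setup in this issue: a small repetition-code check matrix,
# random syndromes, and a hypothetical array `expected` holding the correction
# each shot should produce (none of these come from the issue itself)
H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)
m = pymatching.Matching(H)
syndromes = np.random.randint(0, 2, size=(1000, 2), dtype=np.uint8)
expected = np.zeros((1000, 3), dtype=np.uint8)

# Keep a running tally instead of appending every correction array to a list
num_failures = 0
for syndrome, expected_correction in zip(syndromes, expected):
    correction = m.decode(syndrome)
    num_failures += int(not np.array_equal(correction, expected_correction))
print(num_failures, "failures out of", len(syndromes))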

inmzhang commented 3 years ago

Thanks for replying! @oscarhiggott I ran a d=3, rounds=2 surface code for 500,000 shots and decoded the X errors. The matching graph has 8 detectors and 1 boundary, so each call to decode() is passed a (2*4, 1) ndarray, for 500,000 calls in total. I used only one qubit label to check whether a logical error happened, so the correction returned is a single bool. It takes 25s to decode all 500,000 shots, and I want to reduce that time further. Maybe you have some suggestions. Sorry for the formatting; I have to type this on my phone right now.

oscarhiggott commented 3 years ago

I just ran a couple of tests for surface code circuits with the same dimensions as yours and also got a decoding time of around 50 microseconds per sample with the default settings (using 1% circuit-level depolarising noise). That seems reasonable/expected to me, so I don't think there is any unexpected behaviour. Since the circuits are small, it's a little faster to use exact matching by setting num_neighbours=None in Matching.decode (since this caches the shortest paths). That reduces the runtime to around 35 microseconds for me. There are further optimisations to MWPM that can be made, but if you need to reduce the runtime further you might want to use another algorithm, such as Union-Find. How much faster do you need it to be for your use case?
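
A minimal sketch of that setting, using the decode API as it was at the time of this comment (pre-2.0; in later PyMatching versions num_neighbours is no longer needed). The check matrix and random syndromes are stand-ins, not the surface-code graph from this issue:

import numpy as np
import pymatching

# Stand-in graph and data: a 3-qubit repetition-code check matrix and random syndromes
H = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=np.uint8)
m = pymatching.Matching(H)
syndromes = np.random.randint(0, 2, size=(1000, 2), dtype=np.uint8)

# num_neighbours=None selects exact matching using cached all-pairs shortest paths,
# which tends to be faster when the matching graph is small
corrections = [m.decode(s, num_neighbours=None) for s in syndromes]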

inmzhang commented 3 years ago

Setting num_neighbours=None really helps! It shortens the time to decode 500,000 shots from 25s to 10s, and that's enough for me. I'll close this issue; thanks again for your work.

oscarhiggott commented 1 year ago

Revisiting this issue using decode_batch in PyMatching v2.1.0 (with the new blossom implementation) on a distance=3, rounds=3 dataset from the Google surface code experiment. Running:

import os
import time
import stim
import pymatching

# Load the detector error model and the detection-event data for one dataset
fullpath = os.path.join(experiment_path, "surface_code_bX_d3_r03_center_3_5")
dem = stim.DetectorErrorModel.from_file(os.path.join(fullpath, "pij_from_even_for_odd.dem"))
shots = stim.read_shot_data_file(path=os.path.join(fullpath, "detection_events.b8"), format="b8",
                                 num_measurements=0, num_detectors=dem.num_detectors, num_observables=0)
actual_observables = stim.read_shot_data_file(path=os.path.join(fullpath, "obs_flips_actual.01"), format="01",
                                              num_measurements=0, num_detectors=0, num_observables=1)

# Construct the matching graph from the detector error model and time 10 batched decoding passes
matching = pymatching.Matching.from_detector_error_model(dem)
t1 = time.time()
for i in range(10):
    predicted_observables = matching.decode_batch(shots)
t2 = time.time()
print(t2 - t1)

where experiment_path is the path to the google_qec3v5_experiment_data directory, now takes 0.3 seconds on my M2 machine. I ran 10 loops since the dataset contains 50k shots, not 500k.
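
As a possible follow-up, not part of the original comment, the actual_observables array loaded above could be compared with predicted_observables to estimate the logical error rate (names taken from the snippet above):

import numpy as np

# Compare the batch predictions with the recorded observable flips
num_mistakes = np.sum(np.any(predicted_observables != actual_observables, axis=1))
print("logical error rate:", num_mistakes / shots.shape[0])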