C2QA / bosonic-qiskit

Extension of Qiskit to support hybrid boson-qubit simulations for the NQI C2QA effort.
BSD 2-Clause "Simplified" License
44 stars 8 forks source link

Clarifying what the state returned by simulate actually is (Edit: Counts information after discretize=True on simulate) #91

Closed liu-zixiong closed 11 months ago

liu-zixiong commented 1 year ago

Hi everyone, I wanted to clarify what exactly is "state" from (state, result) as returned by simulate. https://github.com/C2QA/bosonic-qiskit/blob/main/c2qa/util.py#LL385C11-L385C11

From my understanding, it's the final statevector of the experiment after all the operations have been completed. I'm assuming it is from this statevector that the counts data is computed. However, this is sometimes not the case when I'm retrieving the state information after simulating a circuit.

Here I have a simple circuit with a delay gate and noise pass applied on. The qumode is initialized to fock state |7>, and the last entry from the results after discretizing is extracted, since I understand that to be the final result of the circuit.

import qiskit
import c2qa

qmr = c2qa.QumodeRegister(1, 3)
creg = qiskit.ClassicalRegister(3)
circ = c2qa.CVCircuit(qmr, creg)
circ.cv_initialize(7, qmr[0])

circ.cv_delay(duration=100, qumode=qmr[0], unit="ns")
circ.cv_measure(qmr[0], creg)

noise_pass = c2qa.kraus.PhotonLossNoisePass(photon_loss_rates=0.02, circuit=circ, time_unit="ns")
result= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=True)

state, res = result[-1]
print(state)
print(res)

However, I occasionally get a set of strange results. As you can see the statevector is in fock |2>, but the measurement data from result.data() is clearly counts={'0x0': 1024}, or 1024 samples of the |0> state. This behaviour is non-deterministic, and I'm not entirely sure why the final statevector isn't always "Statevector([ 1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j]" whenever counts={'0x0': 1024}.

Statevector([ 0.+0.j,  0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
             -0.+0.j],
            dims=(2, 2, 2))
Result(backend_name='aer_simulator', backend_version='0.12.0', qobj_id='', job_id='55809a32-6d51-43cf-a918-0bfc6dadbc1e', success=True, results=[ExperimentResult(shots=1024, success=True, meas_level=2, data=ExperimentResultData(counts={'0x0': 1024}, statevector=Statevector([ 0.+0.j,  0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
             -0.+0.j],
            dims=(2, 2, 2))), header=QobjExperimentHeader(creg_sizes=[['c2', 3]], global_phase=0.0, memory_slots=3, metadata=None, n_qubits=3, name='circuit-118', qreg_sizes=[['q286', 3]]), status=DONE, seed_simulator=1549333292, metadata={'noise': 'ideal', 'batched_shots_optimization': False, 'remapped_qubits': False, 'parallel_state_update': 1, 'parallel_shots': 4, 'device': 'CPU', 'active_input_qubits': [0, 1, 2], 'measure_sampling': False, 'num_clbits': 3, 'input_qubit_map': [[0, 0], [1, 1], [2, 2]], 'num_qubits': 3, 'method': 'statevector', 'result_types': {'statevector': 'save_statevector'}, 'result_subtypes': {'statevector': 'single'}, 'fusion': {'applied': False, 'max_fused_qubits': 5, 'threshold': 14, 'enabled': True}}, time_taken=0.079272783)], date=2023-06-08T15:25:30.723948, status=COMPLETED, header=None, metadata={'mpi_rank': 0, 'num_mpi_processes': 1, 'num_processes_per_experiments': 1, 'max_gpu_memory_mb': 0, 'time_taken_execute': 0.07938976, 'max_memory_mb': 4096, 'parallel_experiments': 1, 'omp_enabled': True}, time_taken=0.08018302917480469)
tjstavenger-pnnl commented 1 year ago

What Qiskit function/data is called as the state returned depends on how the simulate() function is called (if it has conditional_state_vector or per_shot_state_vector enabled): https://github.com/C2QA/bosonic-qiskit/blob/main/c2qa/util.py#L373-L377

In your example, it should be using state = Statevector(result.get_statevector(circuit_compiled)) which we should verify if that is what you need and expect.

EDIT: Though now as I continue to look at this, I'm not sure that was the issue? I believe the noise model probablistically won't always cause enough photon loss to fock |0>. However, I think you're still expecting the classical bit measure to still match the individual qubits represented in the state vector?

tjstavenger-pnnl commented 1 year ago

Would https://github.com/C2QA/bosonic-qiskit/blob/main/c2qa/util.py#L190 help here?

tjstavenger-pnnl commented 1 year ago

Note too that doing your own measure ahead of using get_counts() may be unexpected as your quantum sate is now collapsed.

liu-zixiong commented 1 year ago

Hi Tim! Sorry that I wasn't clear in the original post. I realised my issue is not in the statevector, but rather the counts information.

Rephrasing the initial question, I have the following cases that I'm facing. cv_measure() is kept within both circuits.

Case 1: Without discretize, the counts information is as expected.

state, res= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=False)

print(c2qa.util.cv_fockcounts(res.get_counts(), [qmr[0]]))
{'5': 2, '4': 7, '1': 394, '3': 59, '2': 221, '0': 341}

Case 2: With discretize, the counts information is rather odd.

result= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=True)
state, res = result[-1]

print(c2qa.util.cv_fockcounts(res.get_counts(), [qmr[0]]))
{'0': 1024}

I'm currently rather unsure as to how to extract out the true counts information from the circuit when using discretize.

liu-zixiong commented 1 year ago

Hi Tim, I also tried testing discretize with per_shot_state_vector, and it seems that I'm not able to activate both args at the same time.

Case 1: Without discretize, I'm able to store per_shot information

state, _= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=False, shots=2, per_shot_state_vector=True)

print(state)
------------------------------
[Statevector([ 0.+0.j,  0.+0.j,  1.+0.j, -0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
              0.+0.j],
            dims=(2, 2, 2)), Statevector([ 1.+0.j, -0.+0.j,  0.+0.j, -0.+0.j,  0.+0.j,  0.+0.j,  0.+0.j,
             -0.+0.j],
            dims=(2, 2, 2))]

Case 2: With discretize, I get an error.

result= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=True, shots=2, per_shot_state_vector=True)
---------------------------------------------------------------------------
QiskitError                               Traceback (most recent call last)
[/Users/username/bosonic-qiskit/jupyter_clean.ipynb] Cell 6 in 1
     [12] circ.cv_measure(qmr[0], creg)
     [14] noise_pass = c2qa.kraus.PhotonLossNoisePass(photon_loss_rates=0.02, circuit=circ, time_unit="ns")
---> [15] result= c2qa.util.simulate(circ, noise_passes=noise_pass, discretize=True, shots=2, per_shot_state_vector=True)

File [~/bosonic-qiskit/c2qa/util.py:452], in simulate(cvcircuit, shots, return_fockcounts, add_save_statevector, conditional_state_vector, per_shot_state_vector, noise_model, noise_passes, max_parallel_threads, discretize)
    450 sim_circuit = circuit.copy()
    451 sim_circuit.data.clear()  # Is this safe -- could we copy without data?
--> 452 sim_circuit.initialize(previous_state)
    453 last_instructions = circuit.data[-1:]  # Get the last instruction
    455 for inst in last_instructions:

File [/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/qiskit/extensions/quantum_initializer/initializer.py:191], in initialize(self, params, qubits)
    188     qubits = [qubits]
    189 num_qubits = len(qubits) if isinstance(params, int) else None
--> 191 return self.append(Initialize(params, num_qubits), qubits)

File [/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/qiskit/extensions/quantum_initializer/initializer.py:57], in Initialize.__init__(self, params, num_qubits)
     36 def __init__(self, params, num_qubits=None):
     37     r"""Create new initialize composite.
     38 
     39     Args:
   (...)
     55             and the remaining 3 qubits to be initialized to :math:`|0\rangle`.
     56     """
---> 57     self._stateprep = StatePreparation(params, num_qubits)
     59     super().__init__("initialize", self._stateprep.num_qubits, 0, self._stateprep.params)

File [/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/qiskit/circuit/library/data_preparation/state_preparation.py:99], in StatePreparation.__init__(self, params, num_qubits, inverse, label)
     96 self._from_label = isinstance(params, str)
     97 self._from_int = isinstance(params, int)
---> 99 num_qubits = self._get_num_qubits(num_qubits, params)
    101 params = [params] if isinstance(params, int) else params
    103 super().__init__(self._name, num_qubits, params, label=self._label)

File [/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/qiskit/circuit/library/data_preparation/state_preparation.py:198], in StatePreparation._get_num_qubits(self, num_qubits, params)
    196 # Check if param is a power of 2
    197 if num_qubits == 0 or not num_qubits.is_integer():
--> 198     raise QiskitError("Desired statevector length not a positive power of 2.")
    200 # Check if probabilities (amplitudes squared) sum to 1
    201 if not math.isclose(sum(np.absolute(params) ** 2), 1.0, abs_tol=_EPS):

QiskitError: 'Desired statevector length not a positive power of 2.'
tjstavenger-pnnl commented 1 year ago

I have a PR pending with a fix to let you run discretized & pershot at the same time.

When you run discretized, the default behavior is to start each discretized cricuit step with the previous discretized step's final statevector. So the last step (that you get back after the simulation is complete) will start with a state and only have a very short step of the last gate. If you're using photon loss throughout the circuit, and the noise put all qubits into |0> would you expect that to be the end state as well?

kevincsmith commented 12 months ago

I am guessing that the issue here is that the returned counts are for the final discretized circuit chunk only, and these results will therefore be highly dependent on the Statevector that is fed in as input. In other words, the counts aren't reflective of the distribution of possible final Statevectors, but rather a sampling of the Statevector fed into the final discretized step for a single shot.

In the example @liu-zixiong posted, it seems that the Statevector fed into the final discretized step was the vacuum and, as a result, counts returns 100% probability for Fock state |0>. Is there any way to feed in a density matrix in place of a Statevector?

tjstavenger-pnnl commented 11 months ago

Thank you, @kevincsmith, for clarifying. I didn't do as well explaining the same idea.

Qiskit's QuantumCircuit.initialize() doesn't document that you can use DensityMatrix, but it seems to "work" in my local testing. Though test cases are taking significantly longer that I remember them taking before and there are additional test case failures I'm working through. In fairness, it isn't documented to work with state vectors either, and that is what we were doing before.

See https://github.com/C2QA/bosonic-qiskit/actions/runs/5546314033 for build results and https://github.com/C2QA/bosonic-qiskit/tree/discretize-with-densitymatrix for code changes.

tjstavenger-pnnl commented 11 months ago

Stepping back a bit -- is discretize=True useful when not animating? Would it be better to not discretize when you need to get the simulation's counts?

liu-zixiong commented 11 months ago

Hi Tim, I believe there are cases where discretize=True improves simulation accuracy when adding photon loss noise. Since photon loss can happen during the middle of a gate's operation, allowing the discretize feature to work with simulation's counts can be useful.

For instance, here is a circuit that measures the photon number parity on the qumode using $\hat{P} = Exp(i \pi \hat{n} )$. The parity information is then mapped onto the qubit.

qmr = c2qa.QumodeRegister(1, 3)
anc = AncillaRegister(1)
cr = ClassicalRegister(1)
circ = c2qa.CVCircuit(qmr, anc, cr)

circ.initialize([1, 0], anc[0]) # Ancilla in |g>
circ.cv_initialize(3, qmr[0]) # Qumode in |3>

# Photon number parity circuit
circ.h(anc[0])
if manually_discretize:
    for _ in range(10): # Manually discretize cv_c_r gate
        circ.cv_c_r(pi/20, qmr[0], anc[0], duration=0.1, unit="µs")
else:
    circ.cv_c_r(pi/2, qmr[0], anc[0], duration=1, unit="µs")
circ.h(anc[0])
circ.measure(anc[0], cr[0])

# Simulate
noise_pass = c2qa.kraus.PhotonLossNoisePass(photon_loss_rates=0.1, circuit=circ, time_unit="µs")
_, _, counts = c2qa.util.simulate(circ, noise_passes=noise_pass, shots=3000)
print(counts)

With manually_discretize=False, we get the output {'0': 140, '1': 2860}. With manually_discretize=True, we get the output {'0': 396, '1': 2604}.

Without implementing discretize with counts, we won't be able to account for photon loss that happens while gates are operating.

kevincsmith commented 11 months ago

Thanks for the example @liu-zixiong. Said another way, applying noise at the end of the gate is only equivalent to applying it "throughout" in cases where the noise commutes with the gate being applied (i.e., gates that do not change the photon number). In all other cases, it becomes important to discretize in order to get accurate results. Intuitively, this is because the probability of losing a photon is dependent upon the current state of the system (which is changing throughout the gate).

One potential solution (which would be pretty cumbersome, but workable) would be to manually run a single shot simulate for as many shots requested, and then accumulate counts from those runs and return that. @liu-zixiong does that seem workable? You are probably the expert here since you have spent a lot of time messing around with the internal counts dict(). If it is too challenging to alter counts internally, we could alternatively return the correct ensemble data at the level of fockcounts with a warning about the base counts.

liu-zixiong commented 11 months ago

@kevincsmith I looked around and found find a way to alter the counts dict() directly. Taking the output of the simulate() function, you can edit the results via results.data()['counts']['<some state>'] = <some value>. To show this in action, we can edit the counts information from the circuit above.

# Simulate some arbitrary circuit
_, results, _ = c2qa.util.simulate(circ, noise_passes=noise_pass, shots=3000)

# Display original counts information
print("Original counts from results dict", results.data()['counts'])

# Change counts information in results dict
results.data()['counts']['0x0'] = 1025
results.data()['counts']['0x1'] = 1131

# Checking output
print("New counts data from results dict", results.data()['counts'])
print("Feeding new counts data into counts_to_fockcounts()", c2qa.util.counts_to_fockcounts(circ, results))
----------------------------------------------------------------------
Original counts from results dict {'0x1': 2576, '0x0': 424}
New counts data from results dict {'0x1': 1131, '0x0': 1025}
Feeding new counts data into counts_to_fockcounts() {'1': 1131, '0': 1025}

I think accumulating the counts information shot by shot and overwriting results dict with the final counts info within simulate() itself is definitely a viable way of fixing the issue, assuming that it is computationally efficient to do the counts accumulation.

tjstavenger-pnnl commented 11 months ago

Altering the dict after it is returned shouldn't be a problem -- that would be just like altering any other dict in Python. I'm assuming you're going to want to cache the results over time as the discretization simulations are completed?

tjstavenger-pnnl commented 11 months ago

After talking with @kevincsmith today, I re-worked the way we're simulating the discretized circuits. I think they should be more what you need, @liu-zixiong. Could you both take a look at what I have in the branch https://github.com/C2QA/bosonic-qiskit/tree/accumulate-discretized-counts and PR https://github.com/C2QA/bosonic-qiskit/pull/98 to see if it works for your use case?

liu-zixiong commented 11 months ago

Hi Tim, thanks for the re-work. I'm definitely able to access counts information now when I include the discretize=True flag. However there's some odd behaviour that I'm experiencing, and I'm wondering if you have a good intuition for what's going on.

Referring to the number parity circuit mentioned above, we have manually_discretize=False outputs {'0': 111, '1': 2889} manually_discretize=True outputs {'0': 356, '1': 2644}

When I run manually_discretize=False and discretize=True, I get {'0': 134, '1': 2866} Since discretize's default behaviour is to break each gate up into 10 pieces, I'd intuitively expect a match between a manual discretize and simulate(discretize=True). But it seems that the performance of including the discretize=True flag doesn't deviate significantly from just having discretize=False. Do you think that it's because I'm not breaking up the gate into enough pieces?

tjstavenger-pnnl commented 11 months ago

When discretizing the circuits, once past discretizing each gate, the next in the sequence would start with the full previous gate instead of the discretizations. I think that contributed to the difference. https://github.com/C2QA/bosonic-qiskit/commit/31f3f5ac82f882830b94433339847ededc6b1ff2#diff-da63187e8257d94a2a28ee1376208b734c0efc5e93840da63783d9be006f8a94

Do you have a feel for how close the manual vs "auto" versions should be? I'd like to make the assertion in a test case: https://github.com/C2QA/bosonic-qiskit/blob/accumulate-discretized-counts/tests/test_discretize.py#L209-L210. I believe I made the current test check for a 20% tolerance, but that may be too generous?

liu-zixiong commented 11 months ago

If the goal is to improve accuracy when simulating photon loss, then ideally the distributions should be identical between manual vs "auto". In that case, standard error might be a better way to quantify the similarity between the two versions, where there should be an overlap between the mean ± standard error of both distributions. Of course, this is a pretty strict condition, and a tolerance of 5% or 10% might be just as effective in accessing accuracy.

tjstavenger-pnnl commented 11 months ago

The runs are probabilistic, correct? I don't think we'll often get the manual vs automatic to be equal. I've written the test case based on your code to run 20 iterations. The last time it ran, the differences varied from less than 1% difference up to 17%: https://github.com/C2QA/bosonic-qiskit/actions/runs/5763364805/job/15625003394#step:6:227

liu-zixiong commented 11 months ago

Looking at the pytest, the results for manual and automatic looks pretty similar actually. I realise the reason why the differences look so varied is only because we're looking at percentage difference as a benchmark instead of something that assesses the underlying distribution directly (like with mean and standard error). I think what you have in the branch already looks pretty good, thanks for the rework! Will you be merging the branch into main after this?

tjstavenger-pnnl commented 11 months ago

I believe #98 addresses this issue