Qiskit / qiskit-aer

Aer is a high performance simulator for quantum circuits that includes noise models
https://qiskit.github.io/qiskit-aer/
Apache License 2.0

Segmentation Fault on Pulse Simulator #416

Closed zachschoenfeld33 closed 4 years ago

zachschoenfeld33 commented 4 years ago

Informations

What is the current behavior?

On certain runs of the pulse_simulator, a Segmentation fault: 11 occurs and immediately halts the program. No result is returned. This does not happen on every run, but I have been able to reproduce it every few runs using the code below (used in #379 to test frame changes).
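Because the crash is intermittent, one way to quantify "every few runs" is to launch the reproduction script repeatedly in a fresh interpreter and count the runs killed by SIGSEGV. The following is a minimal sketch and not part of the original report: the helper name count_segfaults and the file name repro.py are hypothetical, and the negative return code convention it relies on applies to POSIX systems (macOS, Linux) only.

```python
import signal
import subprocess
import sys


def count_segfaults(cmd, runs=20):
    """Run `cmd` `runs` times and count how many runs died from SIGSEGV.

    On POSIX, subprocess reports a child killed by signal N with
    returncode == -N, so a segfault shows up as -signal.SIGSEGV.
    """
    crashes = 0
    for _ in range(runs):
        proc = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        if proc.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes


# Hypothetical usage, with the reproduction script saved as repro.py:
#   n = count_segfaults([sys.executable, "repro.py"], runs=20)
#   print(f"{n} of 20 runs ended in a segmentation fault")
```

Running each attempt in its own process keeps one crash from taking the counting loop down with it.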

Steps to reproduce the problem

Run the following code:

import qiskit
import qiskit.pulse as pulse

from qiskit.compiler import assemble

from qiskit.test.mock.fake_openpulse_2q import FakeOpenPulse2Q
from qiskit.pulse.commands import FrameChange

# setup backend and properties

backend_mock = FakeOpenPulse2Q()
back_config = backend_mock.configuration().to_dict()
system = pulse.PulseChannelSpec.from_backend(backend_mock)
defaults = backend_mock.defaults()

# define the qubits
qubit_0 = 0
freq_qubit_0 = defaults.qubit_freq_est[qubit_0]

qubit_1 = 1
freq_qubit_1 = defaults.qubit_freq_est[qubit_1]

# 1q measurement map (so the qubits can be measured separately)
meas_map_1q = [[qubit_0], [qubit_1]]

# define the pulse time (# of samples)
drive_samples = 100

# Define acquisition
acq_cmd = pulse.Acquire(duration=drive_samples)
acq_0 = acq_cmd(system.acquires[qubit_0],
                        system.memoryslots[qubit_0])

# Get pulse simulator backend
backend_sim = qiskit.Aer.get_backend('pulse_simulator')

# Test frame change where no shift in state results
# specifically: do pi/2 pulse, then pi frame change, then another pi/2 pulse.
# Verify left in |0> state

shots = 10000
# set omega_0, omega_d0 equal (use qubit frequency) -> drive on resonance
omega_0 = 2*np.pi*freq_qubit_0
omega_d0 = omega_0

# set phi = 0
phi = 0

dur_drive1 = drive_samples # first pulse duration
fc_phi = np.pi # fc angle

dur_drive2 = dur_drive1 # same duration for both pulses
omega_a = np.pi/2/dur_drive1 # pi/2 pulse amplitude

# drive pulse (just phase; omega_a included in Hamiltonian)
phase = np.exp(1j*phi)
drive_pulse_1 = SamplePulse(phase*np.ones(dur_drive1), name='drive_pulse_1')
drive_pulse_2 = SamplePulse(phase*np.ones(dur_drive2), name='drive_pulse_2')

# frame change
fc_pulse = FrameChange(phase=fc_phi, name='fc')

# add commands to schedule (pi/2 pulse, then pi frame change, then another pi/2 pulse)
schedule = pulse.Schedule(name='fc_schedule')
schedule |= drive_pulse_1(system.qubits[qubit_0].drive)
schedule += fc_pulse(system.qubits[qubit_0].drive)
schedule += drive_pulse_2(system.qubits[qubit_0].drive)
schedule |= acq_0 << schedule.duration

# Create the hamiltonian
hamiltonian = {}
hamiltonian['h_str'] = []

# Q0 terms
hamiltonian['h_str'].append('-0.5*omega0*Z0')
hamiltonian['h_str'].append('0.5*omegaa*X0||D0')

# Q1 terms
# none

# Set variables in ham
hamiltonian['vars'] = {'omega0': omega_0, 'omegaa': omega_a}

# set the qubit dimension to qub_dim
hamiltonian['qub'] = {'0': qub_dim}

# update the back_end
back_config['hamiltonian'] = hamiltonian
back_config['noise'] = {}
back_config['dt'] = 1.0 # makes time = drive_samples

back_config['ode_options'] = {} # optionally set ode settings

# set qobj params
qubit_list = [qubit_0]
memory_slots = 1
qubit_lo_freq = [omega_d0/(2*np.pi)]
meas_map = meas_map_1q

back_config['qubit_list'] = qubit_list

# construct the qobj
qobj = assemble([schedule], backend_mock,
                meas_level=meas_level, meas_return='single',
                meas_map=meas_map, qubit_lo_freq=qubit_lo_freq,
                memory_slots=memory_slots, shots=shots, sim_config=back_config)

result = backend_sim.run(qobj).result()
counts = result.get_counts()

What is the expected behavior?

Counts should be {'0': shots}. Some runs do return this; other runs end in the segmentation fault described above.
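The expected result can be checked by hand with an idealized model. On resonance and in the rotating frame, each drive pulse acts as a rotation by omega_a * duration = pi/2 about an axis in the X-Y plane, and the pi frame change shifts the phase of the second pulse by pi, so the second rotation undoes the first. The sketch below is a pure-Python check under those assumptions; it does not call the simulator, the function names are illustrative, and the sign convention for the frame change is an assumption (for a shift of pi either sign gives the same result).

```python
import cmath
import math


def rotation(theta, phi):
    """2x2 unitary exp(-i*theta/2 * (cos(phi)*X + sin(phi)*Y))."""
    c = math.cos(theta / 2)
    s = math.sin(theta / 2)
    return [[c, -1j * s * cmath.exp(-1j * phi)],
            [-1j * s * cmath.exp(1j * phi), c]]


def apply(u, state):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [u[0][0] * state[0] + u[0][1] * state[1],
            u[1][0] * state[0] + u[1][1] * state[1]]


def final_populations(fc_phi):
    """Populations after: pi/2 pulse, frame change by fc_phi, pi/2 pulse."""
    state = [1 + 0j, 0j]                                  # start in |0>
    state = apply(rotation(math.pi / 2, 0.0), state)      # first pi/2 pulse
    state = apply(rotation(math.pi / 2, fc_phi), state)   # second pulse, phase shifted
    return [abs(a) ** 2 for a in state]


p0, p1 = final_populations(math.pi)
print(f"P(0) = {p0:.6f}, P(1) = {p1:.6f}")  # P(0) = 1.000000, P(1) = 0.000000
```

With fc_phi set to 0 instead, the two pulses add up to a full pi rotation and the population ends in |1>, which is why the frame change is what this test exercises.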

Suggested solutions

Segmentation faults tend to arise from C code, so I think the best approach would be a thorough read-through of the Cython code in the simulator to check whether any memory is not being allocated properly.
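One low-effort aid while auditing the Cython code is Python's standard faulthandler module, which writes the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The following is a minimal sketch and not something used in this thread; the variable name crash_log and the choice of a temporary log file are illustrative assumptions.

```python
import faulthandler
import tempfile

# Keep the file object alive for as long as the handler is installed:
# faulthandler writes to its file descriptor when a fatal signal arrives.
crash_log = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
faulthandler.enable(file=crash_log, all_threads=True)

print(f"fatal-signal tracebacks will be written to {crash_log.name}")

# ... build the qobj and call backend_sim.run(qobj).result() here ...
```

The same handler can be switched on without editing the script by running it with python -X faulthandler.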

atilag commented 4 years ago

I'll take this.

atilag commented 4 years ago

The example code does not run as posted; there are dependencies that are not met, like:

  1. All np. references (easy to fix by importing numpy)
  2. SamplePulse (from qiskit.pulse import SamplePulse)
  3. qub_dim is not defined (value 2?)
  4. meas_level is not defined (level 2, right?)

atilag commented 4 years ago

Found the crash; it's here: https://github.com/Qiskit/qiskit-aer/blob/6c2dd4ed7c42c9d6775293d9832b6779fb3abed9/qiskit/providers/aer/openpulse/cy/channel_value.pyx#L97 This is very unsafe code... anyway, I'm changing all of this and it will be fixed soon in my refactor.

zachschoenfeld33 commented 4 years ago

Sounds good, thanks @atilag!

singular-value commented 4 years ago

FYI, I am encountering a similar issue. I'm running the latest commit with the openpulse-sim branch (system info: macOS Mojave 10.14.6, running Python 3.7.3, so basically same setup as @zachschoenfeld33).

The segfault behavior is sporadic. I can run the pulse_sim.ipynb notebook. BUT, if I change schedule += cr_rabi_pulse(system.controls[0]) to schedule += cr_rabi_pulse(system.controls[0]) (so that it's trying to drive cross resonance on a different channel), then it fails.

Full stacktrace below if helpful. Happy to re-run things or try things to help debug too.

  self.messages.get(istate, unexpected_istate_msg)))
/Users/pranavgokhale/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/scipy/integrate/_ode.py:1009: UserWarning: zvode: Excess work done on this call. (Perhaps wrong MF.)
  self.messages.get(istate, unexpected_istate_msg)))
---------------------------------------------------------------------------
RemoteTraceback                           Traceback (most recent call last)
RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/pranavgokhale/anaconda3/envs/QiskitDevenv/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/Users/pranavgokhale/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/openpulse/solver/unitary.py", line 87, in unitary_evolution
    raise Exception(err_msg)
Exception: ZVODE exited with status: -1
"""

The above exception was the direct cause of the following exception:

Exception                                 Traceback (most recent call last)
<ipython-input-26-5b82aaf26998> in <module>
----> 1 sim_result = backend_sim.run(cr_rabi_qobj).result()

~/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/aerjob.py in _wrapper(self, *args, **kwargs)
     39         if self._future is None:
     40             raise JobError("Job not submitted yet!. You have to .submit() first!")
---> 41         return func(self, *args, **kwargs)
     42     return _wrapper
     43 

~/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/aerjob.py in result(self, timeout)
     92             concurrent.futures.CancelledError: if job cancelled before completed.
     93         """
---> 94         return self._future.result(timeout=timeout)
     95 
     96     @requires_submit

~/anaconda3/envs/QiskitDevenv/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
    430                 raise CancelledError()
    431             elif self._state == FINISHED:
--> 432                 return self.__get_result()
    433             else:
    434                 raise TimeoutError()

~/anaconda3/envs/QiskitDevenv/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
    382     def __get_result(self):
    383         if self._exception:
--> 384             raise self._exception
    385         else:
    386             return self._result

~/anaconda3/envs/QiskitDevenv/lib/python3.7/concurrent/futures/thread.py in run(self)
     55 
     56         try:
---> 57             result = self.fn(*self.args, **self.kwargs)
     58         except BaseException as exc:
     59             self.future.set_exception(exc)

~/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/backends/pulse_simulator.py in _run_job(self, job_id, qobj, backend_options, noise_model, validate)
     72             self._validate(qobj, backend_options, noise_model)
     73         openpulse_system = digest_pulse_obj(qobj.to_dict())
---> 74         results = opsolve(openpulse_system)
     75         end = time.time()
     76         return self._format_results(job_id, results, end - start, qobj.qobj_id)

~/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/openpulse/solver/opsolve.py in opsolve(op_system)
     59     montecarlo = OP_mcwf(op_system)
     60     # Run the simulation
---> 61     out = montecarlo.run()
     62     # Results are stored in ophandler.result
     63     return out

~/anaconda3/envs/QiskitDevenv/lib/python3.7/site-packages/qiskit/providers/aer/openpulse/solver/opsolve.py in run(self)
    121                                                   self.op_system.ode_options
    122                                                   ),
--> 123                                        **map_kwargs
    124                                        )
    125             end = time.time()

~/Developer/qiskit/qiskit-terra/qiskit/tools/parallel.py in parallel_map(task, values, task_args, task_kwargs, num_processes)
    137         Publisher().publish("terra.parallel.finish")
    138         os.environ['QISKIT_IN_PARALLEL'] = 'FALSE'
--> 139         return [ar.get() for ar in async_res]
    140 
    141     # Cannot do parallel on Windows , if another parallel_map is running in parallel,

~/Developer/qiskit/qiskit-terra/qiskit/tools/parallel.py in <listcomp>(.0)
    137         Publisher().publish("terra.parallel.finish")
    138         os.environ['QISKIT_IN_PARALLEL'] = 'FALSE'
--> 139         return [ar.get() for ar in async_res]
    140 
    141     # Cannot do parallel on Windows , if another parallel_map is running in parallel,

~/anaconda3/envs/QiskitDevenv/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
    655             return self._value
    656         else:
--> 657             raise self._value
    658 
    659     def _set(self, i, obj):

Exception: ZVODE exited with status: -1

nonhermitian commented 4 years ago

This is saying that the underlying ODE solver needed to take more steps than max_steps. However, max_steps is set 100x higher than the default so that the user should not need to change it in general: https://github.com/Qiskit/qiskit-aer/blob/068e70ade5d8eab369e6b97b2df5b2eb7c9ff5b3/qiskit/providers/aer/openpulse/solver/options.py#L56

So, this is saying that the input is not quite correct, or perhaps that there is a non-smooth or discontinuous change in a function that is causing this.

As a general rule, all these errors can be found here:

http://www.netlib.org/ode/zvode.f
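For quick reference, the negative ISTATE values that ZVODE can return are summarized in the sketch below. The wording is paraphrased from the comments in zvode.f, so the linked file remains the authoritative source; the names ZVODE_ERRORS and describe_zvode_status are hypothetical helpers, not part of qiskit-aer or SciPy.

```python
# Negative ISTATE return codes, paraphrased from the ZVODE source comments.
ZVODE_ERRORS = {
    -1: "excess work done on this call (perhaps wrong MF)",
    -2: "excess accuracy requested (tolerances too small)",
    -3: "illegal input detected (see printed message)",
    -4: "repeated error test failures (check all input)",
    -5: "repeated convergence failures "
        "(perhaps bad Jacobian supplied or wrong choice of MF or tolerances)",
    -6: "error weight became zero during the problem",
}


def describe_zvode_status(status):
    """Return a short description for a negative ZVODE exit status."""
    return ZVODE_ERRORS.get(status, f"unrecognized status {status}")


print(describe_zvode_status(-1))
```

The status -1 in the traceback above corresponds to the first entry, the step budget being exhausted.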

nonhermitian commented 4 years ago

Likely fixed by #451

singular-value commented 4 years ago

Thanks. FYI, does seem to be fixed for me now.

On Sat, Nov 16 2019 at 6:23 AM, notifications@github.com wrote:

Likely fixed by #451 https://github.com/Qiskit/qiskit-aer/pull/451
