Qiskit / qiskit

Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
https://www.ibm.com/quantum/qiskit
Apache License 2.0

Investigate porting unitary synthesis to rust #8774

Open mtreinish opened 1 year ago

mtreinish commented 1 year ago

What should we add?

For level 3 compilation, unitary synthesis as part of the 2q block optimization ~becomes a large bottleneck~ (this was mostly supposition on my part; looking at the profile, we spend more time in consolidate blocks than in synthesis) now that we've moved layout and routing mostly to multithreaded Rust code. To speed this up further, we should look at whether we can accelerate the numeric component of:

https://github.com/Qiskit/qiskit-terra/blob/main/qiskit/quantum_info/synthesis/two_qubit_decompose.py

and

~https://github.com/Qiskit/qiskit-terra/blob/main/qiskit/quantum_info/synthesis/one_qubit_decompose.py~ (this was done in https://github.com/Qiskit/qiskit-terra/pull/9185)

by moving it to Rust. For things that are already vectorized NumPy functions, the performance gains may not be that big. The other constraint is that for linear algebra functions that depend on BLAS, we'll probably still want to call out to NumPy (likely via Python, since the NumPy C API doesn't expose many of the linalg functions that leverage BLAS), because we don't want the complexity of linking the Rust binary against BLAS at build time.

I'm not actually sure how much it'll speed things up, since I expect a large chunk of the time is spent building and manipulating circuits. Moving the numeric portion of the modules to Rust won't really change that, because the circuit side is in Python space and Rust can't really accelerate those portions of the modules. This effort is not necessarily going to turn out to be worth it, but we can't really know until we give it a try.

### Tasks
- [ ] https://github.com/Qiskit/qiskit/issues/12008
- [ ] https://github.com/Qiskit/qiskit/issues/12004
- [ ] https://github.com/Qiskit/qiskit/issues/12005
- [ ] https://github.com/Qiskit/qiskit/issues/12006
- [ ] https://github.com/Qiskit/qiskit/issues/12007

mtreinish commented 1 year ago

One thought I did have is that we could do some of the work in parallel: if we allow the synthesis function to take a list of unitaries as input, we can process them in parallel and then just loop over the results to build the circuits. Even if the evaluation doesn't get much faster by moving to Rust, doing multiple things in parallel can still provide a good speedup.
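
To make the batching idea concrete, here is a minimal sketch of the two-phase shape: numeric work for the whole list first, then a serial loop that builds circuits from the results. All names (`numeric_step`, `build_circuit`, `synthesize_batch`) are hypothetical and not Qiskit APIs, and the numeric step is a trivial stand-in (a 2x2 determinant). Note that CPython threads will not actually run pure-Python arithmetic concurrently because of the GIL; the thread pool only illustrates the interface, and the real parallelism would presumably live on the Rust side.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical names throughout; none of these are Qiskit APIs.


def numeric_step(unitary):
    """Stand-in for the numeric part of synthesis: the determinant of a
    2x2 matrix given as nested tuples of complex numbers."""
    (a, b), (c, d) = unitary
    return a * d - b * c


def build_circuit(result):
    """Stand-in for the serial circuit-construction step."""
    return [("det", result)]


def synthesize_batch(unitaries, max_workers=4):
    # Phase 1: run the numeric step for every unitary as one batch.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(numeric_step, unitaries))
    # Phase 2: loop over the results serially to build the circuits.
    return [build_circuit(r) for r in results]


identity = ((1 + 0j, 0j), (0j, 1 + 0j))
pauli_z = ((1 + 0j, 0j), (0j, -1 + 0j))
circuits = synthesize_batch([identity, pauli_z])
```

The point of the split is that only phase 1 needs to be parallel-safe; phase 2 can keep touching Python circuit objects one at a time.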

nonhermitian commented 1 year ago

The internals do quite a bit of intermediate NumPy array creation and numerical operations. There are likely gains to be had by removing those. Not sure how big that is in the grand scheme of things, though.

jlapeyre commented 1 year ago

Many of the vectorized NumPy functions involve small vectors and matrices, with linear dimensions like 2 or 4. For these calculations, the Python overhead of the NumPy calls often dominates.
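
As an illustration of how little arithmetic is involved at these sizes, here is a 2x2 complex matrix product written out by hand, with no loops and no temporary arrays. The function name `matmul_2x2` and the flat row-major tuple layout are assumptions made for this sketch, not anything from the Qiskit code base. A fixed-size kernel like this is eight multiplies and four adds, which is the kind of work where per-call overhead (argument parsing, dispatch, output allocation) can outweigh the computation itself.

```python
def matmul_2x2(a, b):
    """Product of two 2x2 complex matrices, each given as a flat
    row-major 4-tuple (m00, m01, m10, m11)."""
    a00, a01, a10, a11 = a
    b00, b01, b10, b11 = b
    return (
        a00 * b00 + a01 * b10,
        a00 * b01 + a01 * b11,
        a10 * b00 + a11 * b10,
        a10 * b01 + a11 * b11,
    )


# Pauli X times Pauli Z as a small check of the kernel.
pauli_x = (0j, 1 + 0j, 1 + 0j, 0j)
pauli_z = (1 + 0j, 0j, 0j, -1 + 0j)
xz = matmul_2x2(pauli_x, pauli_z)
```

The same unrolled form translates almost line for line to a Rust function over fixed-size arrays.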

mtreinish commented 1 year ago

It's worth pointing out that since I originally opened this issue, I was able to find some benefit to using Rust for the circuit construction too, in the 1q decomposer, in https://github.com/Qiskit/qiskit-terra/pull/9578 and https://github.com/Qiskit/qiskit-terra/pull/9583, so we could probably follow a similar pattern when porting the 2q decomposer.

mtreinish commented 9 months ago

The importance of this has risen a bit: after #8779 was fixed by https://github.com/Qiskit/qiskit/pull/10365 and #10467, we're definitely bottlenecked on the performance of UnitarySynthesis when using optimization level 3 (along with commutative analysis performance).