ronandrevon / ed_qc


Variational quantum eigen solver #1

Open ronandrevon opened 2 years ago

ronandrevon commented 2 years ago

I understand that the VQE computes the expectation value of a Hamiltonian operator (or Rayleigh-Ritz quotient) according to the standard variational Ritz method. The trial states are prepared by the reconfigurable circuit by assigning values to phase shifters according to some parameters fed by a classical optimizer. Each module of the quantum processor actually computes the expectation value of a tensor product of Pauli operators, the Hamiltonian being decomposed as a sum of such products.
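To check that I have the structure right, here is a minimal classical sketch of this loop (my own toy illustration, not the paper's code): an arbitrary 2-qubit Hamiltonian written as a sum of Pauli tensor products, a made-up single-rotation ansatz standing in for the reconfigurable circuit, and Nelder-Mead as the classical outer loop. On the real device each expectation value would of course be estimated from repeated measurements rather than computed exactly.

```python
# Sketch of the VQE structure (illustrative only; exact linear algebra stands
# in for the hardware estimation of each Pauli expectation value).
import numpy as np
from scipy.optimize import minimize

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

# Toy Hamiltonian H = sum_i h_i P_i, each P_i a tensor product of Paulis
terms = [(0.5, np.kron(Z, I2)), (0.3, np.kron(I2, Z)), (0.2, np.kron(X, X))]

def ry(theta):
    """Single-qubit rotation, playing the role of a tunable phase shifter."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

def prepare_state(params):
    """Toy ansatz: one rotation per qubit (no entangling gate, purely illustrative)."""
    zero = np.array([1.0, 0.0])
    return np.kron(ry(params[0]) @ zero, ry(params[1]) @ zero)

def energy(params):
    psi = prepare_state(params)
    # On the device, each <P_i> is a separate "module": a set of repeated
    # measurements in the appropriate basis.
    return sum(h * np.real(psi.conj() @ P @ psi) for h, P in terms)

result = minimize(energy, x0=[0.1, 0.2], method="Nelder-Mead")
print("estimated ground-state energy:", result.fun)
```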

Similar to QPE, the computational quantum advantage boils down to the representation of the quantum state, which for an $n$ q-bit system may be represented in a superposition of $2^n$ basis states. A classical computer working on the $2^n$ amplitude coefficients needs $O(2^n)$ floating-point operations to evaluate an expectation value. On the other hand, the quantum processor directly operates on the q-bit state, for which $O(1/p^2)$ repeated measurements are necessary to get an expectation value with precision $p$. Considering that the minimizer scales linearly with the number of parameters (Nelder-Mead requiring $N+1$ evaluations at each update step), the overall complexity is $O(n^r M |h^2| p^{-2})$ in the quantum case against $O(M 2^n (2^n)^r)$ in the classical case.

This brings me to some questions with respect to the parametrization of the state:

adamcallison commented 2 years ago

I understand that the VQE computes the expectation value of a Hamiltonian operator (or Rayleigh-Ritz quotient)

Yes, and these are equivalent, as the state is normalised.

according to the standard variational Ritz method. The trial states are prepared by the reconfigurable circuit by assigning values to phase shifters

The notion of a "phase shifter" is specific to photonic chips like the one in the Peruzzo paper. Other devices will implement different gates, but it is usually the case that only the single-qubit gates are parameterised (here, the phase shifters), while the entangling gates are fixed (here, the CNOT, which I think is implemented through $d_6$ to $d_8$).
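To make that split concrete, a generic sketch (my own illustration, not the layout of the actual photonic chip): a phase shifter acts on a single qubit/mode as $\mathrm{diag}(1, e^{i\phi})$, and only those angles are handed to the optimizer, while the entangling gate stays fixed.

```python
# Tunable single-qubit phase gates around a fixed entangling gate
# (generic sketch of the "parameterised single-qubit / fixed two-qubit" split,
# not the actual layout of the photonic chip).
import numpy as np

def phase(phi):
    # A phase shifter acts on one qubit/mode as diag(1, e^{i phi})
    return np.diag([1.0, np.exp(1j * phi)])

CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def circuit(phis):
    """Two layers of tunable phases with a fixed CNOT in between."""
    layer1 = np.kron(phase(phis[0]), phase(phis[1]))
    layer2 = np.kron(phase(phis[2]), phase(phis[3]))
    return layer2 @ CNOT @ layer1

U = circuit([0.1, 0.7, 1.3, 2.1])   # only these four angles would be optimized
```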

according to some parameters fed by a classical optimizer. Each module of the quantum processor actually computes the expectation value of a tensor product of Pauli operators, the Hamiltonian being decomposed as a sum of such products.

Yes. They don't say it explicitly, but it seems likely that, for this experiment, each module is just a separate set of runs on the same chip (but measuring a different Pauli operator).

Similar to QPE, the computational quantum advantage boils down to the representation of the quantum state, which for an $n$ q-bit system may be represented in a superposition of $2^n$ basis states. A classical computer working on the $2^n$ amplitude coefficients needs $O(2^n)$ floating-point operations to evaluate an expectation value.

The source of a quantum advantage is a fairly deep philosophical issue, but I don't think it's right to say that quantum speedup is entirely explained by the state representation. For many problems, there is not expected to be any quantum speedup, regardless of how you encode your input. For QPE, the specific, well-crafted algorithm gives you the speedup (although it does of course require the input state to be given as an actual quantum state). If there are advantages in VQE, the discussion of why is still very much open.

On the other hand, the quantum processor directly operates on the q-bit state for which $O(1/p^2)$ repeated measurements are necessary to get an expectation value with precision $p$.

Yes, although perhaps it's worth noting that for a specific set of parameters, performing the repeats can be done quite quickly.
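A quick numerical check of that scaling (standalone Monte Carlo, nothing device-specific): the standard error of an expectation value estimated from $N$ shots falls like $1/\sqrt{N}$, so precision $p$ costs on the order of $1/p^2$ repetitions.

```python
# Shot-noise scaling: estimating <Z> on a single qubit from N measurement shots.
# The error falls roughly like 1/sqrt(N), i.e. N ~ 1/p^2 shots for precision p.
import numpy as np

rng = np.random.default_rng(0)
p0 = 0.7                       # probability of measuring |0>, so <Z> = 2*p0 - 1
exact = 2 * p0 - 1

for shots in [100, 10_000, 1_000_000]:
    outcomes = rng.choice([1, -1], size=shots, p=[p0, 1 - p0])
    print(shots, abs(outcomes.mean() - exact))   # error shrinks ~ 1/sqrt(shots)
```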

Considering that the minimizer scales linearly with the number of parameters (Nelder-Mead requiring $N+1$ evaluations at each update step), the overall complexity is $O(n^r M |h^2| p^{-2})$ in the quantum case against $O(M 2^n (2^n)^r)$ in the classical case.

I've never really considered the big-O notation scalings for VQEs. I'm not sure what $M$ and $r$ are here, but I can believe what you've written. However, it's worth noting that there's usually no formal guarantee of correct outputs for VQEs.

It's also worth noting that, although they've chosen Nelder-Mead here, other classical optimization methods could be used. Since repetitions are being used to estimate expectation values, the so-called "shot noise" that arises from not operating in the infinite-shot limit can sometimes be harmful for gradient-based optimizers. Gradient-free optimizers are sometimes considered, as are stochastic gradient-based optimizers. One that I see a lot in VQE papers is called Adam (no relation).
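To see why shot noise is rough on gradient-based methods, here is a toy one-parameter illustration (my own sketch, unrelated to the paper): a finite-difference gradient divides the per-evaluation noise by the step size, so with a small step the estimate is dominated by noise, whereas a gradient-free method like Nelder-Mead only ever compares the noisy function values themselves.

```python
# Why shot noise hurts naive gradient estimates: the finite-difference quotient
# amplifies the per-evaluation noise by 1/step. (Toy 1-parameter illustration.)
import numpy as np

rng = np.random.default_rng(1)
shots = 1000
sigma = 1.0 / np.sqrt(shots)          # rough shot-noise scale per evaluation

def noisy_energy(theta):
    return np.cos(theta) + rng.normal(0.0, sigma)

theta, step = 0.3, 1e-3
grad_fd = (noisy_energy(theta + step) - noisy_energy(theta - step)) / (2 * step)
print("true gradient :", -np.sin(theta))
print("noisy FD grad :", grad_fd)     # dominated by noise of size ~ sigma/step
```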

  • I wonder if $n$ is the actual number of q-bits? The prototype works with 2 q-bits but 6 parameters are being optimized. I can see from the circuit that the number of physical gates should scale linearly with the number of q-bits, provided that the number of phase shifters per q-bit does not scale with larger systems. Then the number of optimized parameters should indeed scale linearly with the number of q-bits.

I don't have much to say here, other than that I agree it's not clear. Ultimately, to have good results from this algorithm, the parameterised circuit should be sufficiently expressible; that is, by varying the parameters, you should be able to generate output states (pre-measurement) that cover much of the Hilbert space (or at least some relevant part of the Hilbert space), so that it's likely to be able to express a state that is reasonably close to the target ground-state. Given that adding more qubits exponentially increases the size of the Hilbert space, it seems intuitive that you would need to increase the number of parameterised gates per qubit (and perhaps also the number of fixed CNOT gates) to maintain expressibility.

This brings me to some questions with respect to the parametrization of the state:

  • In the supplementary information, the proposed mapping uses 6 parameters to represent a 2 q-bit state. Do you know (or have references about) how this parametrization evolves for a 3-, 4-, or 5-q-bit system represented by 8, 16, or 32 amplitudes respectively?

No. By simple counting arguments, one in principle needs $2(2^n - 1)$ real-valued degrees of freedom to be able to express the entire Hilbert space of $n$ qubits. This is because there are $2^n$ complex-valued amplitudes in a (pure) quantum state on $n$ qubits, but one degree of freedom is fixed by normalisation and another constitutes an irrelevant global phase. This happens to match up with 6 parameters for 2 qubits, but I doubt this is why they chose 6.
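Spelling out the counting for this case:

$$
2 \cdot 2^{n} \ \text{real parameters} \;-\; 1 \ (\text{normalisation}) \;-\; 1 \ (\text{global phase}) \;=\; 2\,(2^{n}-1), \qquad n=2:\ 2\,(2^{2}-1)=6 .
$$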

  • The coupled cluster is mentioned in the paper and its references as a special ansatz that is intractable classically. Is this ansatz somehow used and related to this mapping?

I don't think so. The (unitary) coupled cluster ansatz is an example of a "problem-inspired" ansatz circuit that is particularly suited to quantum chemistry problems. It's thought that this should make the classical optimization easier for some problems, perhaps because states in Hilbert space that are of the right form to be quantum-chemistry ground-states will be closer together in parameter space. By preparing a suitable initial state (typically a classical approximation of the ground-state that ignores quantum correlations), your ansatz is likely to prepare the right sort of states, and the classical optimization is easier.

Unfortunately, problem-inspired ansatze are often difficult to implement in current devices, so a lot of research looks at "hardware-inspired" ansatze instead. This essentially just means building a circuit that the device can easily perform, as has been done here. This approach can lead to the problem of "barren plateaus", where there are regions of parameter space where the gradients become very small, and the optimizer can get lost. Mitigating this effect is an active topic of research.

  • 6 parameters are mapped to 8 physical components, the phase shifters, among which 2 are used to account for the tensor product of Pauli operators (in the supplementary information, $\phi_5$ and $\phi_6$ are set to $\pi$ to actually represent the identity operator). Do you know how the mapping changes for the different tensor products $\sigma_i$ and $\sigma_i \otimes \sigma_j$? Would higher-order tensor products require more channels and phase shifters?

I don't understand their mapping well enough to talk about the required phases for the measurements. Setting $\phi_5 $ and $\phi_6 $ to $\pi $ is done analytically in their calculation in order to find the mapping between the state parameters and the circuit parameters.

In fact, it seems that actually 4 of the 8 shifters are used to set the measurement basis; $\phi_5$ and $\phi_6$ are used exclusively for this, while $\phi_7$ and $\phi_8$ are given parameters that combine state preparation settings and measurement settings. The key point is that the physical measurement is only performed in one basis, probably the $\sigma_Z$ basis. If so, then to do a logical measurement in the $\sigma_Z$ basis, the "measurement phases" on the relevant qubit would be set to $\pi$ so that the identity is performed. If a logical measurement is required in the $\sigma_X$ basis, then I'm guessing the measurement phases would be set to $\pi/2$, which may perform a Hadamard gate. These can be combined to produce measurements of tensor products of Pauli operators.
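A small sanity check of that basis-change trick (generic numpy, not specific to the photonic implementation): since $H Z H = X$, measuring $\sigma_X$ on a state gives the same statistics as applying a Hadamard and then measuring $\sigma_Z$.

```python
# Basis-change trick for Pauli measurements: <psi|X|psi> == <H psi|Z|H psi>,
# because H Z H = X. (Generic check, not specific to the photonic chip.)
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.diag([1.0, -1.0])

rng = np.random.default_rng(2)
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)            # random single-qubit state

exp_x_direct = np.real(psi.conj() @ X @ psi)
exp_x_via_z  = np.real((H @ psi).conj() @ Z @ (H @ psi))
print(exp_x_direct, exp_x_via_z)      # identical up to floating-point error
```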

To produce higher-order products, you would need more qubits for them to act on, so yes, you would need more channels. However, sometimes Hamiltonians on more than 2 qubits can just be written with one- and two-body terms. In these cases, it's possible to schedule your measurements so that multiple compatible measurement operators can be performed on the same run, in order to reduce the total number of runs.

ronandrevon commented 2 years ago

In the suppl. info, $M$ is simply the number of terms in the Hamiltonian (so I guess this confirms each term is computed one after the other, as you said, as opposed to what one might think looking at Fig. 1). $r$ is an exponent which depends on the optimizer. I thought that a crucial part of the big-O complexity scaling comes from the optimizer having only around $n$ parameters to work with in VQE, against $2^n$ in the fully classical case. But we have come to agree this aspect is a bit speculative at this stage, since in this example 6 parameters are used for a 2 q-bit state.

I have had a look at this paper : Kandala, A., Mezzacapo, A., Temme, K., Takita, M., Brink, M., Chow, J. M., & Gambetta, J. M. (2017). Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671), 242–246. which seems to go in the direction you are mentioning but this is way out of my depth.

The quantum optics circuit implementation is certainly the most approachable for me and I find this aspect of the paper quite interesting. I am getting hold of some of the textbooks referenced in the paper available at our library. Apart from these, would you know a good online reference where I can get a detailed explanation of this aspect of the paper?

Since VQE solves a variational problem, it is somewhat similar to the type of problems solved in quantum annealing. It is apparently not restricted to quadratic forms thanks to the possibility to evaluate higher-order terms through higher-order Pauli tensor products. I wonder if this is also possible with QA?

In the simple approach of electron diffraction (ED) simulation, we are basically trying to find the eigenvector $\Psi$ of a 1-electron Hamiltonian corresponding to a known eigenvalue. This seems to be possible with QA from this paper: Criado, J. C., & Spannowsky, M. (2022). Qade: Solving Differential Equations on Quantum Annealers, but I am sceptical about the scaling behaviour. Do you think there could be an advantage using VQE with the same formulation?

Ultimately we want to solve the crystallographic ED problem, i.e. finding the electrostatic potential that would give far-field intensities $|\Psi|^2$ as faithful as possible to the measured intensities. This paper: Donatelli, J. J., & Spence, J. C. H. (2020). Inversion of Many-Beam Bragg Intensities for Phasing by Iterated Projections: Removal of Multiple Scattering Artifacts from Diffraction Data. Physical Review Letters, 125(6), 65502, formulates it as a minimization problem, but I have not implemented it yet classically.

That is probably very naive, but I was wondering: if we assume that we know the eigenvector $\Psi$ (we actually only know $|\Psi|^2$) and eigenvalue (200 keV) from experiment, could we evaluate the Schrödinger Hamiltonian for that eigenvector, but have the input state represent some parameters of the electrostatic potential that we want to optimize? The parameter search being very large, we could then take full advantage of the state representation (although I understand this is not necessarily always the origin of a quantum advantage).

adamcallison commented 2 years ago

In the suppl. info, $M$ is simply the number of terms in the Hamiltonian (so I guess this confirms each term is computed one after the other, as you said, as opposed to what one might think looking at Fig. 1). $r$ is an exponent which depends on the optimizer. I thought that a crucial part of the big-O complexity scaling comes from the optimizer having only around $n$ parameters to work with in VQE, against $2^n$ in the fully classical case. But we have come to agree this aspect is a bit speculative at this stage, since in this example 6 parameters are used for a 2 q-bit state.

Yes, it's quite speculative, but also quite implementation dependent. Good ansatze would lead to good expressibility over feasible states (especially if the first steps are to prepare some good classical approximation, such as a Hartree-Fock state as in the unitary coupled-cluster ansatz approach) with fewer parameters.

I have had a look at this paper : Kandala, A., Mezzacapo, A., Temme, K., Takita, M., Brink, M., Chow, J. M., & Gambetta, J. M. (2017). Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671), 242–246. which seems to go in the direction you are mentioning but this is way out of my depth.

This article https://materialstheory.springeropen.com/articles/10.1186/s41313-021-00032-6 gives a review of approaches to ansatz design. I've never read this paper in detail, but if you decide to look at it I'm happy to delve in if you have any questions.

The quantum optics circuit implementation is certainly the most approachable for me and I find this aspect of the paper quite interesting. I am getting hold of some of the textbooks referenced in the paper available at our library. Apart from these, would you know a good online reference where I can get a detailed explanation of this aspect of the paper?

What aspect do you mean? Setting the phases to incorporate the correct measurements? It seems to me that it would be specific to the device, so I doubt it; it's really something they should have explained better in their suppl. material. I can try to figure out how they would have set them. I can imagine how they would do Pauli-X operators, but Pauli-Y seems like it would be trickier. Different devices would implement their measurements in different ways.

Since VQE solves a variational problem, it is somewhat similar to the type of problems solved in quantum annealing. It is apparently not restricted to quadratic forms thanks to the possibility to evaluate higher-order terms through higher-order Pauli tensor products. I wonder if this is also possible with QA?

The reason VQE can handle Hamiltonians with higher-order Pauli tensor products is that they only enter the algorithm through measurement. It is possible to infer the measurement outcome of a tensor product of Paulis by just performing the relevant single Pauli measurements and combining the results. The probability distribution of these inferred outcomes will be the same as if you had performed the actual higher-order measurement. The post-measurement state will be different, because the successive measurements collapse the state differently, but since the post-measurement state here is just being discarded, it doesn't matter. It is possible to perform the higher-order measurements more directly, but it would require additional gates and some ancillary qubits to do it (e.g. for measuring $Z_1 \otimes Z_2$ you need to transfer the parity information to an ancillary qubit, I think by a pair of CNOT gates that uses qubits 1 and 2 as controls and the ancillary qubit as a target).
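A quick numerical check of the first point (my own sketch, with a random 2-qubit state): sampling both qubits in the $\sigma_Z$ basis and multiplying the $\pm 1$ outcomes reproduces the expectation value of $\sigma_Z \otimes \sigma_Z$.

```python
# Inferring <Z (x) Z> by measuring each qubit in the Z basis and multiplying
# the +/-1 outcomes (sketch with sampling from a random 2-qubit state).
import numpy as np

rng = np.random.default_rng(3)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

Z = np.diag([1.0, -1.0])
exact = np.real(psi.conj() @ np.kron(Z, Z) @ psi)

# Z-basis measurement of both qubits: basis state |b1 b2> occurs with
# probability |amplitude|^2 and contributes the product of its two +/-1 outcomes
probs = np.abs(psi) ** 2
samples = rng.choice(4, size=200_000, p=probs)
products = np.array([1, -1, -1, 1])[samples]   # outcomes for |00>,|01>,|10>,|11>
print(exact, products.mean())                  # agree up to shot noise
```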

In QA, the Hamiltonian should be directly implemented on the device, and it's difficult to physically create interactions of a higher order than quadratic. There are ways to compile higher-order Hamiltonians to quadratic ones, but this will also require many ancillas.
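For reference, a minimal example of that compilation step in the binary/QUBO picture (a standard reduction, written out as my own sketch): a cubic term $x_1 x_2 x_3$ is replaced by $y\,x_3$ plus a penalty that forces the ancilla $y = x_1 x_2$ at the minimum.

```python
# Reducing a cubic binary term x1*x2*x3 to quadratic with one ancilla y:
# replace it by y*x3 and add the penalty 3y + x1*x2 - 2*y*(x1 + x2),
# which is 0 exactly when y == x1*x2 and >= 1 otherwise.
from itertools import product

def penalty(x1, x2, y):
    return 3 * y + x1 * x2 - 2 * y * (x1 + x2)

M = 10  # penalty weight, large enough to dominate the term being reproduced

for x1, x2, x3 in product([0, 1], repeat=3):
    cubic = x1 * x2 * x3
    # minimising the quadratic surrogate over the ancilla recovers the cubic term
    quadratic = min(y * x3 + M * penalty(x1, x2, y) for y in (0, 1))
    assert quadratic == cubic
print("cubic term reproduced by the quadratic surrogate on all assignments")
```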

As an aside, I was recently thinking about the idea of using quantum annealers in a more variational way, i.e. by treating the native interactions as parameters in a "quantum annealing ansatz" and performing the variational minimisation. I don't know of any work that does this, but there is one company that I suspect might be working on it behind an NDA...

In the simple approach of electron diffraction (ED) simulation, we are basically trying to find the eigenvector $\Psi$ of a 1-electron Hamiltonian corresponding to a known eigenvalue. This seems to be possible with QA from this paper: Criado, J. C., & Spannowsky, M. (2022). Qade: Solving Differential Equations on Quantum Annealers, but I am sceptical about the scaling behaviour. Do you think there could be an advantage using VQE with the same formulation?

Which aspect of scaling are you worried about? In any case, in my view, asymptotic complexity is often over-emphasised. In practice, even classically, we use some algorithms because they are fast and reliable in practice, rather than because of asymptotic complexity guarantees. In general, it's worth pursuing quantum algorithm ideas even if we might not think there will be a provable guarantee (as long as it's not doing something very obviously inefficient).

To answer the question, I think it is possible that VQEs could be useful in principle (if they are going to be useful for anything) with this formulation. However, if I understand correctly, this formulation doesn't get round the problem of the large classical representation of the quantum state: the eigenstates of the Ising model formulation are essentially bit-strings that encode the classical representation of the function being integrated (in your case this would be the quantum state?), and the physical quantum state is some superposition over classical representations of the function (so, over classical representations of quantum states). This is different from the normal VQE for quantum chemistry, where the goal is to put the qubits in the physical quantum state that's being searched for. Any advantage, in either QA or VQE, must here come from the possibly enhanced exploration capabilities of quantum annealing, compared to classical minimizers.

Ultimately we want to solve the crystallographic ED problem, i.e. finding the electrostatic potential that would give far-field intensities $|\Psi|^2$ as faithful as possible to the measured intensities. This paper: Donatelli, J. J., & Spence, J. C. H. (2020). Inversion of Many-Beam Bragg Intensities for Phasing by Iterated Projections: Removal of Multiple Scattering Artifacts from Diffraction Data. Physical Review Letters, 125(6), 65502, formulates it as a minimization problem, but I have not implemented it yet classically.

It sounds interesting, I will try to read the paper.

That is probably very naive, but I was wondering: if we assume that we know the eigenvector [something went wrong with formatting, I think you are saying you would know $|\Psi|^2$] and eigenvalue (200 keV) from experiment, could we evaluate the Schrödinger Hamiltonian for that eigenvector, but have the input state represent some parameters of the electrostatic potential that we want to optimize? The parameter search being very large, we could then take full advantage of the state representation (although I understand this is not necessarily always the origin of a quantum advantage).

I don't quite understand. Are you suggesting to encode the electrostatic potential parameters as a quantum state?

ronandrevon commented 2 years ago

Yes, I agree; this is the reason I am not sure this formulation is very promising (which is what I expressed as the scaling behaviour, both storage and computational). It feels like the number of q-bits required quickly becomes overwhelming, as it encodes the state of the unknown eigenvector in a way that is too straightforward.

Yes, that's where I see a challenge in applying quantum processing to ED: the electrostatic potential may contain a large number of parameters, but we only need to solve a 1-electron Hamiltonian whose eigenstate is usually not even all that big.

I see. For QA, I understand that the device can allow the state to avoid local minima thanks to quantum mechanisms such as quantum tunnelling, which can get the state through a high but thin barrier. On the other hand, a global classical minimizer (I have had experience with the genetic NSGA-II algorithm) has to work with parameters which, in large dimensions, restrict the search space.

Thank you I will have a look at the paper.

Yes, somehow, since this is the part that may be intractable classically, but then it probably makes more sense to have a formulation that uses QA. I am still not familiar enough with quantum algorithms to come up with a proper formulation for that.