Open · AdrianPerezSalinas opened this issue 3 years ago
@AdrianPerezSalinas thanks for this issue. I think this is feasible by computing the analytic gradients either manually or automatically (via TensorFlow or other backends).
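As a rough sketch of the automatic route, an expectation value evaluated through a TensorFlow backend can be differentiated with `tf.GradientTape`; the `expectation` below is a toy stand-in, not Qibo's actual API:

```python
import tensorflow as tf

# Toy stand-in for a differentiable circuit expectation value; with a
# TensorFlow simulation backend the real <H> would play this role.
def expectation(params):
    return tf.cos(params[0]) * tf.sin(params[1])

params = tf.Variable([0.1, 0.2], dtype=tf.float64)
with tf.GradientTape() as tape:
    loss = expectation(params)
grad = tape.gradient(loss, params)  # analytic gradient via autodiff
print(grad.numpy())
```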
Do you think automatic gradients are compatible with finite-difference techniques?
Usually, automatic analytic gradients are more efficient than finite differences, which scipy's minimize already computes.
I think I did not explain myself properly. When I say finite differences for quantum circuits, I am talking about a method that computes the exact analytical gradient by shifting the parameter values by a macroscopic amount (for most operators this shift is π/2). This is robust against the inherent statistical noise, so it is useful in experiments. See Eqs. 13 and 14 here: https://arxiv.org/pdf/1811.11184.pdf
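For standard rotation gates the rule reduces to d<H>/dθ_k = (1/2) [<H>(θ_k + π/2) - <H>(θ_k - π/2)]. A minimal sketch of this, assuming only a user-supplied `expectation(params)` callable (not an existing Qibo function):

```python
import numpy as np

def parameter_shift_gradient(expectation, params, shift=np.pi / 2):
    """Exact gradient of expectation(params) via pi/2 parameter shifts.

    Valid for gates generated by operators with eigenvalues +-1/2
    (standard rotations); `expectation` is any callable returning <H>.
    """
    grad = np.zeros_like(params)
    for k in range(len(params)):
        shifted = params.copy()
        shifted[k] += shift
        forward = expectation(shifted)
        shifted[k] -= 2 * shift
        backward = expectation(shifted)
        grad[k] = 0.5 * (forward - backward)
    return grad

# Sanity check on a toy expectation <Z> = cos(theta):
toy = lambda p: np.cos(p[0])
print(parameter_shift_gradient(toy, np.array([0.3])))  # ~ [-sin(0.3)]
```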
Thanks for the clarification, and sorry for the misunderstanding. Yes, this is something which may help, and it would for sure be interesting to have built in.
Nice! We can discuss it tomorrow.
Hi @scarrazza, I leave here a document explaining my point in more detail. Hope you find it useful! main.pdf
Hi @scarrazza, I have been testing the method I proposed to you. I have not done much yet, but the results are pretty promising. I ran a test for the standard VQE with small circuits, as stated in Qibo's examples. If I compute the expected value of the Hamiltonian exactly (0 shots), both methods return the same optimization path: VQE_0shots.pdf
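For what it is worth, that 0-shot agreement can be reproduced in miniature with any smooth stand-in objective (the toy cost below is not the actual VQE): L-BFGS-B driven by the π/2-shift gradient and by its internal finite differences reach essentially the same minimum.

```python
import numpy as np
from scipy.optimize import minimize

def cost(p):                        # toy stand-in for the exact <H>
    return np.sum(np.cos(p))

def shift_grad(p):                  # pi/2 parameter-shift gradient
    g = np.empty_like(p)
    for k in range(p.size):
        e = np.zeros_like(p)
        e[k] = np.pi / 2
        g[k] = 0.5 * (cost(p + e) - cost(p - e))
    return g

x0 = np.full(4, 0.3)
exact = minimize(cost, x0, jac=shift_grad, method='L-BFGS-B')
numeric = minimize(cost, x0, method='L-BFGS-B')   # internal finite diff
print(exact.fun, numeric.fun)                     # essentially identical
```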
However, if I do the same with measurements (so the approximation is rougher), the optimization behaves very differently. It gets stuck quickly, since the derivatives do not provide any information: VQE_10000000.0shots.pdf
In addition, it looks like it does not matter how many shots you perform to estimate the Hamiltonian; the results are not good.
Today I will check the next steps.
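A possible explanation, sketched as a toy check with Gaussian noise of size sigma standing in for shot noise: a central finite difference divides the statistical error by the small step 2*eps, so the noise is amplified by roughly 1/eps no matter how many shots are taken, while the π/2-shift estimator keeps the benign prefactor 1/2.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1e-3            # stand-in for shot noise on each <H> evaluation
theta, eps = 0.7, 1e-4
f = lambda x: np.cos(x) + rng.normal(scale=sigma)   # noisy toy <H>

# Central finite difference: statistical error amplified by 1/(2*eps)
fd = [(f(theta + eps) - f(theta - eps)) / (2 * eps) for _ in range(1000)]
# pi/2 parameter shift: same 1/2 prefactor, no noise amplification
ps = [(f(theta + np.pi / 2) - f(theta - np.pi / 2)) / 2 for _ in range(1000)]

print("exact derivative:", -np.sin(theta))
print("finite-diff std: ", np.std(fd))   # ~ sigma / (sqrt(2) * eps), huge
print("param-shift std: ", np.std(ps))   # ~ sigma / sqrt(2), tiny
```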
Ok, thanks for these tests, let's see.
I have tested a fitting problem like qPDF; the results are comparable: fit_0shots.pdf fit_10000shots.pdf
Same test for a simple classifier: classifier_0shots.pdf classifier_10000shots.pdf
I think it is clear that we need to implement this functionality for optimizations over measurements. In addition, at least for measurement simulations, it is much faster to do it this new way.
Do you want me to show you the code?
Hi @AdrianPerezSalinas , just wondering if you are aware of this paper by Simon Benjamin on "Quantum Analytic Descent": https://arxiv.org/abs/2008.13774
I was not aware of it, but thank you very much for pointing it out!
I have taken a look at it, and it sounds really interesting. However, after today's discussion I think that the exact-derivatives method or this one can reduce the error due to sampling, but neither can deal with imperfect circuits. We will have to investigate this further.
I leave here the results of some tests made with the exact derivative and a VQE model. As you can see, with errors of order 0.1% we can still achieve some minimization. Noisier circuits are chaotic.
I think that implementing this kind of exact derivative would be interesting only if we know that the circuit noise is below a certain threshold.
OK, and how does this compare to the numerical derivative for similar configurations?
Neither returns extremely good results, but exact derivatives are more resilient to noise and errors than numerical ones. Not by much, though.
VQE_0shots_0.01.pdf VQE_0shots_0.001.pdf VQE_10000shots_0.01.pdf VQE_10000shots_0.001.pdf
Hi everyone, have a Happy New Year.
Most of the time we use variational circuits that depend on a set of tunable parameters. We have been using scipy methods to find the optimal parameters, and recently the parallel L-BFGS-B method was added to the repository. My proposal is to extend this method to the case where the gradient of the function is supplied for the optimization. I think this is useful for two main reasons:
I have been looking at the code and saw that the core of the computation is essentially delegated to the standard scipy recipe:
```python
with self.mp.Pool(processes=self.processes) as self.pool:
    from scipy.optimize import minimize
    out = minimize(fun=self.fun, x0=x0, jac=self.jac,
                   method='L-BFGS-B', bounds=self.bounds,
                   callback=self.callback, options=self.options)
```
Thus, it should be easy to implement what I am asking for. The first step would be to allow the use of the keyword `fprime`, which is already defined for this function; this is trivial, I think. The second step would be to include the gradient function itself, although this could be left to the user to pass as an argument. What do you think?
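For illustration only, the requested interface essentially amounts to forwarding a user-supplied gradient callable to scipy (toy `loss`/`gradient` below, not Qibo code):

```python
import numpy as np
from scipy.optimize import minimize

def loss(p):
    return np.sum((p - 1.0) ** 2)   # placeholder for the circuit loss

def gradient(p):
    return 2.0 * (p - 1.0)          # e.g. built from parameter shifts

# With jac= given, L-BFGS-B consumes the supplied gradient instead of
# building one internally from finite differences.
out = minimize(fun=loss, x0=np.zeros(4), jac=gradient, method='L-BFGS-B')
print(out.x)  # ~ [1, 1, 1, 1]
```

Note that `minimize` takes the gradient through the `jac` keyword (as in the repository snippet above), while `fprime` is the corresponding keyword of the older `scipy.optimize.fmin_l_bfgs_b` interface.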