GTorlai / PastaQ.jl

Package for Simulation, Tomography and Analysis of Quantum Computers
Apache License 2.0

Design interface for optimization of parametrized circuits #224

Closed · GTorlai closed this 2 years ago

GTorlai commented 2 years ago

We should finalize the interface for optimizing parametrized circuits. Two near-term use cases are minimizing the expectation value of a given MPO (e.g. VQE) and maximizing a fidelity (e.g. optimal coherent control). In both cases the optimization reduces to evaluating environments and propagating the gradients through the contractions.
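
For concreteness, a minimal sketch of what those two losses could look like, using PastaQ's gate-tuple circuit format. `variational_circuit` is a hypothetical ansatz, the exact `inner` signature varies across ITensors versions, and `H` and `ϕ` are assumed to live on the same site indices as the circuit output:

```julia
using PastaQ, ITensors

# Hypothetical one-parameter-per-qubit ansatz in PastaQ's gate-tuple format.
variational_circuit(θ) = [("Ry", n, (θ = θ[n],)) for n in 1:length(θ)]

# VQE-style loss: energy ⟨ψ(θ)|H|ψ(θ)⟩ for an MPO H.
function energy_loss(θ, H::MPO)
  ψ = runcircuit(variational_circuit(θ))
  return real(inner(ψ', H, ψ))
end

# Coherent-control-style loss: infidelity against a target state ϕ.
function infidelity_loss(θ, ϕ::MPS)
  ψ = runcircuit(variational_circuit(θ))
  return 1 - abs2(inner(ϕ, ψ))
end
```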

mtfishman commented 2 years ago

As part of the PEPS gradient optimization code in https://github.com/ITensor/ITensorNetworkAD.jl, we are thinking about general strategies for approximate gradient optimization of tensor networks, which may be of interest here as well (though in the circuit case the approximate gradients can be computed in a straightforward way with runcircuit, so a general strategy may not be necessary).
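
To illustrate that straightforward route (a sketch only, not a proposed interface): approximate gradients from central finite differences, where each `loss` evaluation re-runs the circuit simulation with runcircuit, e.g. `θ -> energy_loss(θ, H)` from the hypothetical sketch above:

```julia
# Approximate gradient of a scalar loss by central finite differences,
# re-running the (approximate) circuit simulation once per shifted parameter.
function fd_gradient(loss, θ::Vector{<:Real}; ϵ = 1e-5)
  g = similar(θ, Float64)
  for k in eachindex(θ)
    θp = copy(θ); θp[k] += ϵ
    θm = copy(θ); θm[k] -= ϵ
    g[k] = (loss(θp) - loss(θm)) / (2ϵ)
  end
  return g
end
```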

GTorlai commented 2 years ago

I think we could use the AD part to compute the gradients of the tensors that are removed from the network with respect to the tensor parameters. Otherwise those derivatives would have to be implemented manually.
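
A toy version of that idea, with plain arrays standing in for ITensors: the environment of the removed gate is held fixed, and Zygote only differentiates the map from the parameter to the gate tensor (all names here are illustrative):

```julia
using Zygote

# Toy single-qubit Ry gate as a function of its parameter.
Ry(θ) = [cos(θ / 2) -sin(θ / 2); sin(θ / 2) cos(θ / 2)]

# Scalar network value: contraction of a fixed environment E with the
# removed gate tensor; for matrices this is just sum(E .* Ry(θ)).
network_value(θ, E) = sum(E .* Ry(θ))

E = randn(2, 2)  # stands in for the environment of the removed gate
dθ = Zygote.gradient(θ -> network_value(θ, E), 0.3)[1]
```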

mtfishman commented 2 years ago

Right, that is definitely a step where we will want to use AD.

GTorlai commented 2 years ago

Can we already plug in the ITensorNetworkAD.jl for this use case?

mtfishman commented 2 years ago

We can probably just use the simpler package https://github.com/mtfishman/ITensorsGrad.jl for that more limited usage (like differentiating through constructing an ITensor from an Array). ITensorNetworkAD.jl builds on top of that functionality for network-level AD.
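
That limited usage might look something like this sketch; it assumes chain rules for ITensor construction and contraction of the kind ITensorsGrad.jl provides, so the loading mechanism and exact rule coverage are assumptions:

```julia
using ITensors, Zygote

i = Index(2, "i")
j = Index(2, "j")

# Differentiate through building an ITensor from a parameter-dependent
# Array and contracting it down to a scalar.
function f(θ)
  A = [θ 2θ; 3θ θ^2]
  T = ITensor(A, i, j)
  return scalar(T * dag(T))
end

dθ = Zygote.gradient(f, 0.2)[1]
```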

Actually, I think the hardest part will be doing AD through all of the data structures we use to make the gate list: we have to differentiate through building the gate list, and then through building the ITensor gates from the gate list, and many of those operations are probably not differentiable. We will have to think about how to do that correctly.
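
To make the concern concrete, here is a miniature of the pipeline's shape, with a trivial stand-in for the gate-construction and contraction steps (which are the parts where differentiation rules may actually be missing); everything here is illustrative:

```julia
using Zygote

# Miniature of the pipeline shape: parameters flow into NamedTuples inside
# the gate list, and AD has to see through their construction.
buildgates(θ) = [("Ry", n, (θ = θ[n],)) for n in 1:length(θ)]

# Stand-in for "make ITensor gates and contract": it just reads the
# parameters back out. The real versions of these steps are where
# chain rules may be missing.
apply_all(gates) = sum(g[3].θ^2 for g in gates)

∇θ = Zygote.gradient(θ -> apply_all(buildgates(θ)), [0.1, 0.2, 0.3])[1]
```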

GTorlai commented 2 years ago

I was thinking of something much simpler: just using AD at the lowest level of the AD graph, and exact gradients for everything above that. Eventually we will want to use AD on the end-to-end calculation.

mtfishman commented 2 years ago

Gotcha, maybe there is a simpler way to do it than the one I am picturing.