chris-n-self opened this issue 3 years ago
Hey Chris! So there is a quick example of how to work around this currently here:
https://github.com/jcmgray/quimb/issues/77#issuecomment-767077816
Ideally, yes, the `TNOptimizer` would take a `Circuit` as a possible input. I think this should be pretty easy to implement but I haven't got round to it. It would need the `Circuit._psi` parsed, before supplying it to `loss_fn` etc. I don't see any obstacles to these, and maybe a `CircuitOptimizer` using a `TNOptimizer` internally is the cleanest outward facing API.

Note that when it comes to autodiff, not all the simplification methods that `Circuit` has turned on by default are compatible (since they depend on the actual tensor values rather than just shapes). This might have a big impact on how efficient the contractions are in comparison to 'fully simplified' versions.
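(As a rough sketch of that last point, and going from memory on the `simplify_sequence` argument of the circuit expectation methods: restricting to `'R'`, which I believe only looks at tensor shapes, should keep the contraction structure valid for whatever values autodiff feeds through it.)

```python
# hedged sketch: compute a local expectation with only shape-based
# simplification ('R' = rank simplification); the value-dependent passes
# in the default 'ADCRS' sequence aren't generally autodiff-safe
import quimb as qu
import quimb.tensor as qtn

circ = qtn.Circuit(2)
circ.h(0)
circ.cnot(0, 1)

ZZ = qu.pauli('Z') & qu.pauli('Z')
expec = circ.local_expectation(ZZ, (0, 1), simplify_sequence='R')
```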
Thanks Johnnie! That's really helpful.
I am finding the optimisation fails for me though when I use that code for my circuit and Hamiltonian. The progress bar displays larger and larger negative numbers, and `tnopt.res` has `fun: nan` and `message: b'ABNORMAL_TERMINATION_IN_LNSRCH'`.
My edited version of your example is:
```python
import quimb as qu
import quimb.tensor as qtn
from autoray import do
import itertools
import functools
from scipy.stats import uniform


def circ_qaoa_zzxz(
    terms,
    depth,
    lambdas,
    gammas,
    betas,
    **circuit_opts,
):
    """edited from quimb"""
    circuit_opts.setdefault('gate_opts', {})
    circuit_opts['gate_opts'].setdefault('contract', False)

    n = max(itertools.chain.from_iterable(terms)) + 1

    gates = []

    # layer of hadamards to get into plus state
    for i in range(n):
        gates.append((0, 'h', i))

    for d in range(depth):
        for (i, j) in terms:
            gates.append((d, 'rzz', -lambdas[d], i, j))
        for i in range(n):
            gates.append((d, 'rx', gammas[d] * 2, i))
        for i in range(n):
            gates.append((d, 'rz', betas[d] * 2, i))

    circ = qtn.Circuit(n, **circuit_opts)
    circ.apply_gates(gates)

    return circ


# setup a circuit
n = 4
p = int(n / 2)
lambdas = qu.randn(p)
gammas = qu.randn(p)
betas = qu.randn(p)
nn_edges = [(i, (i + 1) % n) for i in range(n)]
circ = circ_qaoa_zzxz(nn_edges, p, lambdas, gammas, betas)

# the local operators
J_rands = uniform.rvs(loc=-1, scale=2, size=len(nn_edges))
hX_rands = uniform.rvs(loc=-1, scale=2, size=n)
ops = {nn: Jr * qu.pauli('Z') & qu.pauli('Z') for Jr, nn in zip(J_rands, nn_edges)}
ops.update({i: -1 * hXr * qu.pauli('X') for i, hXr in enumerate(hX_rands)})

# tags describing the reverse lightcone for each operator
lightcone_tags = {where: circ.get_reverse_lightcone_tags(where) for where in ops}


# function that takes the input TN and computes the ith loss
def loss_i(psi, where, ops):
    tags = lightcone_tags[where]
    ket = psi.select(tags, 'any')
    bra = ket.H
    expec = ket.gate(ops[where], where) | bra
    return do('real', expec.contract(all, optimize='auto-hq'))


# since the terms are a sum we can evaluate them separately
loss_fns = [
    functools.partial(loss_i, where=where)
    for where in ops
]

tnopt = qtn.TNOptimizer(
    circ.psi,
    loss_fn=loss_fns,
    tags=['RZZ', 'RX', 'RZ'],
    loss_constants={'ops': ops},
    autodiff_backend='jax',
)
tnopt.optimize_basinhopping(n=10, nhop=10)

# -2.577747688523476e+33 [best: -2.577747688523476e+33] :   9%|▉ | 9/100 [01:09<09:22, 6.18s/it]
# /home/cns08/anaconda3/envs/qaoadf-cotengra/lib/python3.8/site-packages/quimb/tensor/optimize.py:386: RuntimeWarning: invalid value encountered in add
#   grads[i] += g_i
# -2.577747688523476e+33 [best: -2.577747688523476e+33] :   9%|▉ | 9/100 [01:12<12:13, 8.06s/it]
# <TensorNetwork1DVector(tensors=32, indices=40, L=4, max_bond=2)>
```
I think you just need to specify the gates as parametrized:

```python
circ.rx(0.123, 5, parametrize=True)
```

Currently the optimizer takes the raw (and unbounded) entries of the gate tensors (leading to the blowup), whereas when they are parametrized the tensor is defined through the function that generates the unitary gate, so the state should stay normalized.
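For the QAOA example above, a minimal sketch of one way to do that (reusing the names `n`, `p`, `nn_edges`, `lambdas`, `gammas`, `betas` from your script, and assuming `apply_gate` forwards `parametrize` and `gate_round` to the individual gate functions, which is how I believe it works) would be to apply the rotation gates directly rather than collecting them in the `gates` list:

```python
# sketch: build the same circuit but with the rotation gates parametrized,
# so the optimizer varies the angles rather than raw gate-tensor entries
circ = qtn.Circuit(n, gate_opts={'contract': False})

for i in range(n):
    circ.apply_gate('H', i, gate_round=0)

for d in range(p):
    for (i, j) in nn_edges:
        circ.apply_gate('RZZ', -lambdas[d], i, j, gate_round=d, parametrize=True)
    for i in range(n):
        circ.apply_gate('RX', gammas[d] * 2, i, gate_round=d, parametrize=True)
    for i in range(n):
        circ.apply_gate('RZ', betas[d] * 2, i, gate_round=d, parametrize=True)
```

The rest of the script (lightcone tags, loss functions, `TNOptimizer`) should then work unchanged, since the `'RZZ'`, `'RX'` and `'RZ'` tags are still attached to the (now parametrized) gate tensors.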
That's great, thank you! That fixed that problem. Does parameterising the gates have any other effects?
Not really: rather than storing an array, a parametrized tensor just calls `fn(params)` when you access its `.data` attribute. I suppose repeatedly retrieving the data might be slightly slower... but it may even be cached, can't remember!
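(A rough sketch of that behaviour, constructing a `PTensor` by hand; I'm going from memory on the exact constructor signature here:)

```python
import numpy as np
import quimb.tensor as qtn

# a parametrized tensor stores a function plus its parameters and
# regenerates the array whenever .data is accessed
def phase_gate(params):
    return np.array([[1.0, 0.0], [0.0, np.exp(1j * params[0])]])

pt = qtn.PTensor(phase_gate, params=[0.3], inds=('k0', 'b0'), tags={'RZ'})
print(pt.data)       # the 2x2 array generated by phase_gate([0.3])

pt.params = [1.1]    # the optimizer updates params rather than the raw array...
print(pt.data)       # ...and .data reflects the new parameters
```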
Thanks, that's great. The reason I thought it might work to leave the tensors unparameterised was because of the `else` branch of the `if isinstance(t, PTensor):` check in the `inject_` function in `optimize.py`:
```python
def inject_(arrays, tn):
    for t in tn:
        for tag in t.tags:
            match = variable_finder.match(tag)
            if match is not None:
                i = int(match.groups(1)[0])

                if isinstance(t, PTensor):
                    t.params = arrays[i]
                else:
                    t.modify(data=arrays[i])

                break
```
Purely out of interest, would it have also worked to bound the optimisation parameters in some way so that `t.modify(data=arrays[i])` stays valid?
Sort of related to this (though this is maybe getting off track from the original issue), I have wondered about how to hack the `TNOptimizer` class to allow two different parameterised tensors to share an optimisation parameter. Would it work to edit the `parse_network_to_backend` function to assign the same `variable_tag` to tensors that share one of the "optimise over" tags passed? The current function is below, and a rough sketch of the change I have in mind follows it.
```python
def parse_network_to_backend(tn, tags, constant_tags, to_constant):
    tn_ag = tn.copy()
    variables = []

    variable_tag = "__VARIABLE{}__"

    for t in tn_ag:
        # check if tensor has any of the constant tags
        if t.tags & constant_tags:
            t.modify(apply=to_constant)
            continue

        # if tags are specified only optimize those tagged
        if tags and not (t.tags & tags):
            t.modify(apply=to_constant)
            continue

        if isinstance(t, PTensor):
            data = t.params
        else:
            data = t.data

        # jax doesn't like numpy.ndarray subclasses...
        if isinstance(data, qarray):
            data = data.A

        # append the raw data but mark the corresponding tensor for reinsertion
        variables.append(data)
        t.add_tag(variable_tag.format(len(variables) - 1))

    return tn_ag, variables
```
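Something like this is roughly what I mean (just a sketch, not tested; `shared_tags` is a hypothetical new argument):

```python
def parse_network_to_backend(tn, tags, constant_tags, to_constant, shared_tags=()):
    # sketch only: tensors carrying the same tag from `shared_tags` get marked
    # with the same __VARIABLE{}__ tag, so they are re-injected from one array
    tn_ag = tn.copy()
    variables = []
    variable_tag = "__VARIABLE{}__"
    shared_index = {}  # maps a shared tag -> the variable index it was given

    for t in tn_ag:
        if (t.tags & constant_tags) or (tags and not (t.tags & tags)):
            t.modify(apply=to_constant)
            continue

        # does this tensor carry one of the 'shared' tags?
        shared = next((tg for tg in shared_tags if tg in t.tags), None)
        if shared is not None and shared in shared_index:
            # reuse the existing variable rather than appending a new one
            t.add_tag(variable_tag.format(shared_index[shared]))
            continue

        data = t.params if isinstance(t, PTensor) else t.data
        if isinstance(data, qarray):
            data = data.A

        variables.append(data)
        i = len(variables) - 1
        if shared is not None:
            shared_index[shared] = i
        t.add_tag(variable_tag.format(i))

    return tn_ag, variables
```

I think `inject_` would already cope with this, since each tensor just looks up its own `__VARIABLE{}__` tag, though the tensors sharing a tag would of course need compatible `params` shapes.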
Regarding the `PTensor`, I'm not sure I totally understand the question, but maybe let me elaborate on what happens: setting `parametrize=True` puts `PTensor` instances in the TN. When the optimizer is working out how large the total vector space is, it takes either the raw array size or the params size, so e.g. for the same gate the unparametrized and parametrized `arrays[i]` will generally be different sizes.
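(For instance, a quick way to see the size difference for a single parametrized gate; I'm assuming the gate tensor can be pulled out by its `'RX'` tag:)

```python
import quimb.tensor as qtn

circ_p = qtn.Circuit(1)
circ_p.rx(0.123, 0, parametrize=True)

t = circ_p.psi['RX']   # the parametrized gate tensor
print(t.params)        # just the angle, something like [0.123]
print(t.data.shape)    # (2, 2) - the full gate array generated from it
```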
> Purely out of interest, would it have also worked to bound the optimisation parameters in some way so that `t.modify(data=arrays[i])` stays valid?
I think not, generally: if you consider e.g. a rotation, the param space is smaller (a single angle) but generates a 2x2 complex matrix, and there's at least no simple way (i.e. `x in [-b, b]`) to bound the entries of the matrix directly.
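(A quick illustration of the point, purely for concreteness:)

```python
import numpy as np

# one unbounded angle always generates a valid 2x2 unitary, whereas
# box-bounding the four complex matrix entries individually would not
# keep the matrix on the unitary manifold
def rx(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

for theta in np.random.uniform(-10, 10, size=5):
    U = rx(theta)
    assert np.allclose(U @ U.conj().T, np.eye(2))   # unitary for any angle
```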
Regarding linking tensor optimizations, I think it would be pretty much that simple. One just needs a way to cleanly specify that all tags in e.g. `optimize_together=[...]` should be... optimized together. That would be a neat addition.
> I think not, generally: if you consider e.g. a rotation, the param space is smaller (a single angle) but generates a 2x2 complex matrix, and there's at least no simple way (i.e. `x in [-b, b]`) to bound the entries of the matrix directly.

Ah, I think I see the difference now, thanks.
> Regarding linking tensor optimizations, I think it would be pretty much that simple. One just needs a way to cleanly specify that all tags in e.g. `optimize_together=[...]` should be... optimized together. That would be a neat addition.
If I had a go at implementing this, do you have any preference about what the new kwarg should be called? I was thinking potentially `shared_tags`, to keep it consistent with the `tags` and `constant_tags` names. Also, would there then be any tests that need to be updated for this addition?
I would like to use the approach in example "10. Bayesian Optimizing QAOA Circuit Energy" (using cotengra contraction orders and the circuit `local_expectation` functions exploiting light-cone cancellations), but also take advantage of all the auto-differentiation integration in the `TNOptimizer` class.

I had a look at the code in `tensor\optimize.py` & `tensor\circuit.py`, and I am not sure how to proceed. My first thoughts would be to do one of:

1. wrapping the `local_expectation` functions into a cost function to be passed to `TNOptimizer` with just a TN object, or
2. accessing the `psi` attribute of the circuit when needed.

Would you advise trying any of those? Or is there a better approach you can see? Alternatively, should I forget about trying to use the circuit `local_expectation` functions and just calculate the energy expectation of the TN?