Thanks @anthayes92, could you also share the relevant part of the traceback? I suspect maybe an issue with ExpvalCost, or something to do with your input parameter shape.
Sure thing, here's the traceback:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-1-52d4bb51526b> in <module>
26
27 for _ in range(100):
---> 28 params = optimizer.step(cost_function, params)
~/.local/lib/python3.8/site-packages/pennylane/optimize/qng.py in step(self, qnode, x, recompute_tensor, metric_tensor_fn)
216 array: the new variable values :math:`x^{(t+1)}`
217 """
--> 218 x_out, _ = self.step_and_cost(
219 qnode, x, recompute_tensor=recompute_tensor, metric_tensor_fn=metric_tensor_fn
220 )
~/.local/lib/python3.8/site-packages/pennylane/optimize/qng.py in step_and_cost(self, qnode, x, recompute_tensor, metric_tensor_fn)
188 if metric_tensor_fn is None:
189 # pseudo-inverse metric tensor
--> 190 self.metric_tensor = qml.metric_tensor(qnode, diag_approx=self.diag_approx)(x)
191 else:
192 self.metric_tensor = metric_tensor_fn(x)
~/.local/lib/python3.8/site-packages/pennylane/tape/qnode.py in _metric_tensor_fn(*args, **kwargs)
1013
1014 def _metric_tensor_fn(*args, **kwargs):
-> 1015 jac = qml.math.stack(_get_classical_jacobian(_qnode)(*args, **kwargs))
1016 jac = qml.math.reshape(jac, [_qnode.qtape.num_params, -1])
1017
~/.local/lib/python3.8/site-packages/pennylane/_grad.py in _jacobian_function(*args, **kwargs)
174
175 if len(argnum) == 1:
--> 176 return _jacobian(func, argnum[0])(*args, **kwargs)
177
178 return _np.stack([_jacobian(func, arg)(*args, **kwargs) for arg in argnum]).T
~/.local/lib/python3.8/site-packages/autograd/wrap_util.py in nary_f(*args, **kwargs)
18 else:
19 x = tuple(args[i] for i in argnum)
---> 20 return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
21 return nary_f
22 return nary_operator
~/.local/lib/python3.8/site-packages/autograd/differential_operators.py in jacobian(fun, x)
57 vjp, ans = _make_vjp(fun, x)
58 ans_vspace = vspace(ans)
---> 59 jacobian_shape = ans_vspace.shape + vspace(x).shape
60 grads = map(vjp, ans_vspace.standard_basis())
61 return np.reshape(np.stack(grads), jacobian_shape)
TypeError: can only concatenate tuple (not "list") to tuple
Hey @anthayes92! I think there are two separate things happening here.
Instead of

params = [[0.5]*depth, [0.5]*depth]

you can write

params = np.stack([[0.5]*depth, [0.5]*depth], requires_grad=True)

This fixes the autograd error in the traceback above: autograd's jacobian expects an array argument, and passing a plain Python list of lists is what triggers the "can only concatenate tuple (not "list") to tuple" TypeError.
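For context, here is a minimal, self-contained sketch of the difference (the depth value here is just a placeholder, not taken from your script):

import pennylane as qml
from pennylane import numpy as np  # PennyLane's autograd-aware NumPy

depth = 5  # placeholder

# A plain Python list of lists is what makes autograd's jacobian fail above:
bad_params = [[0.5] * depth, [0.5] * depth]

# Stacking into a single trainable array gives autograd a proper array shape:
params = np.stack([[0.5] * depth, [0.5] * depth], requires_grad=True)
print(params.shape)  # (2, 5)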
However, that leads us to the second problem: this is a bit of a subtlety caused by the qml.layer() function. You pass an array of parameters of shape [2, 5] to the QNode, but because the layers are repeated, the resulting quantum circuit has 60 trainable gate arguments. As a result, the metric tensor will be of size [60, 60]!
>>> mt = qml.metric_tensor(cost_function)(params)
>>> print(params.shape, mt.shape)
(2, 5) (60, 60)
This will confuse the optimizer, which won't be able to apply the QNG update step due to the shape mismatch.
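To see how the mismatch arises, here is a hedged toy example (a made-up three-wire ansatz, not your QAOA circuit, run against the PennyLane version from the traceback) where a small QNode argument expands into many trainable gate arguments via qml.layer:

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=3)

def one_layer(gamma, beta):
    # 6 gate arguments per layer: the same two scalars feed 6 gates
    for w in range(3):
        qml.RZ(gamma, wires=w)
    for w in range(3):
        qml.RX(beta, wires=w)

@qml.qnode(dev)
def circuit(params):
    # params has shape (2, depth); qml.layer repeats the template depth times
    qml.layer(one_layer, 5, params[0], params[1])
    return qml.expval(qml.PauliZ(0))

params = np.stack([[0.5] * 5, [0.5] * 5], requires_grad=True)
circuit(params)

print(params.shape)              # (2, 5) -> 10 QNode parameters
print(circuit.qtape.num_params)  # 30     -> trainable gate arguments
print(qml.metric_tensor(circuit)(params).shape)  # (30, 30) with that version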
The reason it has been implemented like this:

1. The introductory paper https://arxiv.org/abs/1909.02108 does not consider QNG optimization in the context of classical pre-processing of parameters; convergence is only proven when optimizing the gate arguments directly.

2. Historically, PennyLane also did not allow classical pre-processing inside a QNode, so you were 'blocked' from even attempting to do this in the software.
However, since the new core was introduced, (2) is no longer the case, as you have shown in your code example above 🙂
I'm not really sure what to do here. We have three options I can think of off the top of my head (the third one is described further below).

1. Leave the behaviour as-is, and raise a more useful exception in the optimizer.

2. Modify the qml.metric_tensor function to take into account classical processing between QNode arguments and gate arguments.

This is pretty simple to do. If we say that f: R^m -> R^n is the function representing the transformation from QNode arguments to gate arguments, we can simply compute the Jacobian of this function and return jac.T @ metric_tensor @ jac. This is already quite easy in PennyLane:
>>> from pennylane.tape.qnode import _get_classical_jacobian
>>> jac = qml.math.stack(_get_classical_jacobian(qnode)(*args, **kwargs))
>>> mt = qml.metric_tensor(qnode)(*args, **kwargs)
>>> mt = qml.math.tensordot(mt, jac, axes=[-1, 0])
>>> mt = qml.math.tensordot(jac, mt, axes=[0, 0])
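To put that snippet into context, here is a rough end-to-end sketch using the same toy layered circuit as above. The _get_classical_jacobian helper is private and the shapes assume the PennyLane version from the traceback, so treat this as illustrative only:

import pennylane as qml
from pennylane import numpy as np
from pennylane.tape.qnode import _get_classical_jacobian  # private, version-specific

dev = qml.device("default.qubit", wires=3)

def one_layer(gamma, beta):
    for w in range(3):
        qml.RZ(gamma, wires=w)
    for w in range(3):
        qml.RX(beta, wires=w)

@qml.qnode(dev)
def circuit(params):
    qml.layer(one_layer, 5, params[0], params[1])
    return qml.expval(qml.PauliZ(0))

params = np.stack([[0.5] * 5, [0.5] * 5], requires_grad=True)

# Jacobian of the classical map f: QNode arguments -> gate arguments
jac = qml.math.stack(_get_classical_jacobian(circuit)(params))
jac = qml.math.reshape(jac, [circuit.qtape.num_params, -1])  # (30, 10)

# Metric tensor over the 30 gate arguments, pulled back onto the
# 10 QNode parameters via jac.T @ mt @ jac
mt = qml.metric_tensor(circuit)(params)                      # (30, 30)
mt = qml.math.tensordot(mt, jac, axes=[-1, 0])
mt = qml.math.tensordot(jac, mt, axes=[0, 0])
print(mt.shape)                                              # (10, 10)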
In fact, I've made this change in the branch fix-1154 (feel free to check it out and try locally), and your QAOA example trains very well:
Cost: -0.40057391269073295
Cost: -0.7268916684780963
Cost: -0.9224609566668688
Cost: -1.0434952436963247
Cost: -1.1458920801395698
Cost: -1.2608435077627338
Cost: -1.4016272161590428
Cost: -1.570999017410614
Cost: -1.7649277793607872
Cost: -1.9744020721362818
Cost: -2.1843779694162007
Cost: -2.375999280893928
Cost: -2.534565991142417
Cost: -2.6552873670133157
Cost: -2.743599940207657
Cost: -2.8083248766962265
Cost: -2.856584386208008
Cost: -2.8930529233848223
Cost: -2.9207394714729924
Cost: -2.9417108541537256
Cost: -2.9574920403418647
Cost: -2.9692628260314633
Cost: -2.9779547882892436
Cost: -2.984307487473674
Cost: -2.988902676037116
Cost: -2.992192882000369
Cost: -2.9945215574673965
Cost: -2.996143598489102
Cost: -2.997236836771621
Cost: -2.9979120262799204
My one concern is whether this 'hybrid' QNG optimization is guaranteed to converge better than vanilla gradient descent (or even guaranteed to converge at all), since it is not explored in the paper.
3. Have the QNode itself return the quantum natural gradient during backpropagation. E.g.,

@qml.qnode(dev, natural_gradient=True)
def circuit(params):
    ...

The quantum circuit will return F^{-1} @ quantum_gradient to the ML library, which will then continue to perform standard backpropagation through the remaining classical part of the computation.
This is pretty much equivalent to (2) in practice, but conceptually clearer; the metric tensor is applied directly to the quantum gradient during backpropagation, rather than to the full hybrid gradient later during optimization. Another advantage is that you can then use any optimizer (e.g., qml.GradientDescentOptimizer, qml.AdamOptimizer), and the quantum natural gradient will continue to be used for the quantum components.
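For intuition only (the natural_gradient=True QNode option above is a proposal, not an existing API), the effect on a pure QNode with no classical parameter processing can be emulated by hand against the PennyLane release used in this thread: precondition the ordinary quantum gradient with the pseudo-inverse of the metric tensor, then feed it to any plain update rule. The two-qubit circuit below is a made-up example:

import pennylane as qml
from pennylane import numpy as np
import numpy as onp  # plain NumPy, just for the pseudo-inverse

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(params):
    qml.RY(params[0], wires=0)
    qml.RX(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

params = np.array([0.4, -0.3], requires_grad=True)
lr = 0.1

for _ in range(50):
    grad = qml.grad(circuit)(params)          # ordinary quantum gradient
    mt = qml.metric_tensor(circuit)(params)   # Fubini-Study metric tensor F
    nat_grad = onp.linalg.pinv(mt) @ grad     # F^{-1} @ quantum_gradient
    # plain gradient-descent update using the natural gradient
    params = np.array(params - lr * nat_grad, requires_grad=True)

print(circuit(params))

This is essentially the pseudo-inverse-metric-tensor update the QNG optimizer performs; option (3) would move that preconditioning into the QNode's backward pass instead.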
After thinking about it more, I'm a fan of solution (3). It is a much more flexible approach, and 'shifts' the burden of applying the metric tensor to the gradient vector away from the optimizer gradient update step, and into the quantum gradient logic.
Instead of QNG being an optimization approach, it is an extension to the quantum gradient.
In other words, instead of thinking of a 'specific QNG optimizer that only applies to purely quantum systems', you can build hybrid models where quantum components provide the quantum natural gradient during backprop. You can even have multiple QNodes in an optimization, with one using the quantum natural gradient and another using regular quantum gradients.
Expected behavior: Returns a list of updated parameters.
Actual behavior: Raises a TypeError.
System information: Platform info: Linux-5.8.0-45-generic-x86_64-with-glibc2.29; Python version: 3.8.5; Numpy version: 1.19.5; Scipy version: 1.4.1