Closed cvjjm closed 2 years ago
Thanks @cvjjm for catching this! Actually, I wonder why our test added in https://github.com/PennyLaneAI/pennylane/pull/1291 didn't catch this.
@chaeyeunpark I'm guessing this was introduced in #2448?
It is indeed surprising that the test from #1291 doesn't fail. I must admit that I didn't have time to check this carefully. I just noticed that some of my code no longer works with master and that commenting out this line fixes things.
I went hunting for the test; since we recently removed old unused tape subclasses, it seems that this test was moved from `test_jacobian_tape.py` to `test_parameter_shift.py`:
Thanks @josh146. I didn't have time to look into this more, but I can report that differentiating such non-standard observables is also broken in the current master:
```
  File "/.../pennylane/pennylane/_grad.py", line 328, in _jacobian_function
    jac = tuple(_jacobian(func, arg)(*args, **kwargs) for arg in _argnum)
  File "/.../pennylane/pennylane/_grad.py", line 328, in <genexpr>
    jac = tuple(_jacobian(func, arg)(*args, **kwargs) for arg in _argnum)
  File "/.../autograd/autograd/wrap_util.py", line 20, in nary_f
    return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
  File "/.../autograd/autograd/differential_operators.py", line 61, in jacobian
    return np.reshape(np.stack(grads), jacobian_shape)
  File "/.../autograd/autograd/numpy/numpy_wrapper.py", line 88, in stack
    arrays = [array(arr) for arr in arrays]
  File "/.../autograd/autograd/numpy/numpy_wrapper.py", line 88, in <listcomp>
    arrays = [array(arr) for arr in arrays]
  File "/.../autograd/autograd/core.py", line 14, in vjp
    def vjp(g): return backward_pass(g, end_node)
  File "/.../autograd/autograd/core.py", line 22, in backward_pass
    for parent, ingrad in zip(node.parents, ingrads):
  File "/.../autograd/autograd/core.py", line 49, in <genexpr>
    return lambda g: (vjp(g) for vjp in vjps)
  File "/.../autograd/autograd/builtins.py", line 82, in <lambda>
    defvjp_argnum(make_sequence, lambda argnum, *args: lambda g: g[argnum - 1])
IndexError: list index out of range
```
Both were still working with v0.22.2.
The root cause of this turned out to be something entirely different :-) The device with which I discovered the problem also provides a device gradient, and this is what is broken now. Surprisingly, just declaring that a device provides a gradient now makes it unusable, even for evaluating, say, parameter-shift gradients of QNodes, as this minimal example shows (notice how I explicitly ask for `diff_method='parameter-shift', mode="backward"` and indeed the device's `.jacobian()` is never called):
```python
import pennylane as qml
from pennylane.devices import DefaultQubit
from pennylane import numpy as np


class MyQubit(DefaultQubit):
    @classmethod
    def capabilities(cls):
        capabilities = super().capabilities().copy()
        capabilities.update(
            provides_jacobian=True,  # with this commented out everything works
        )
        return capabilities

    def jacobian(self, *args, **kwargs):
        raise NotImplementedError()


dev = MyQubit(wires=2)


@qml.qnode(dev, diff_method='parameter-shift', mode="backward")
def qnode(params):
    qml.RY(params[0], wires=[0])
    return qml.expval(qml.PauliZ(0))


params = np.array([0.2])
print(qnode(params))
print(qml.jacobian(qnode)(params))
```
which results in:
```
0.9800665778412417
asefaefafeawef [0] <QNode: wires=2, device='default.qubit', interface='autograd', diff_method='parameter-shift'> (tensor([0.2], requires_grad=True),) {}
Traceback (most recent call last):
  File "pl_test9.py", line 27, in <module>
    print(qml.jacobian(qnode)(params))
  File "/.../pennylane/pennylane/_grad.py", line 329, in _jacobian_function
    jac = tuple(_jacobian(func, arg)(*args, **kwargs) for arg in _argnum)
  File "/.../pennylane/pennylane/_grad.py", line 329, in <genexpr>
    jac = tuple(_jacobian(func, arg)(*args, **kwargs) for arg in _argnum)
  File "/.../autograd/autograd/wrap_util.py", line 20, in nary_f
    return unary_operator(unary_f, x, *nary_op_args, **nary_op_kwargs)
  File "/.../autograd/autograd/differential_operators.py", line 61, in jacobian
    return np.reshape(np.stack(grads), jacobian_shape)
  File "/.../autograd/autograd/numpy/numpy_wrapper.py", line 88, in stack
    arrays = [array(arr) for arr in arrays]
  File "/.../autograd/autograd/numpy/numpy_wrapper.py", line 88, in <listcomp>
    arrays = [array(arr) for arr in arrays]
  File "/.../autograd/autograd/core.py", line 14, in vjp
    def vjp(g): return backward_pass(g, end_node)
  File "/.../autograd/autograd/core.py", line 23, in backward_pass
    outgrads[parent] = add_outgrads(outgrads.get(parent), ingrad)
  File "/.../autograd/autograd/core.py", line 176, in add_outgrads
    return sparse_add(vspace(g), None, g), True
  File "/.../autograd/autograd/tracer.py", line 48, in f_wrapped
    return f_raw(*args, **kwargs)
  File "/.../autograd/autograd/core.py", line 186, in sparse_add
    return x_new.mut_add(x_prev)
  File "/.../autograd/autograd/numpy/numpy_vjps.py", line 698, in mut_add
    onp.add.at(A, idx, x)
ValueError: array is not broadcastable to correct shape
```
Thanks @cvjjm, this is helpful, but also troubling 😬
```
  File "/.../autograd/autograd/numpy/numpy_vjps.py", line 698, in mut_add
    onp.add.at(A, idx, x)
ValueError: array is not broadcastable to correct shape
```
I am very familiar with this traceback, and it always scares me since it is very hard to debug; it's an indication that a broadcasting rule we are using --- that is permitted on the forward pass --- is breaking on the autograd backwards pass 😬 And the traceback doesn't indicate where in the forward pass the issue is!
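For what it's worth, the numpy failure at the very bottom of the traceback can be reproduced in isolation. This is just a standalone illustration of the broadcasting rule, unrelated to PennyLane internals:

```python
import numpy as np

# np.add.at accumulates updates in place; it raises once the update value
# cannot be broadcast to the shape of the indexed slot -- the same ValueError
# that surfaces at the bottom of the autograd backward pass.
A = np.zeros(3)
np.add.at(A, [0, 1], 1.0)          # fine: a scalar broadcasts to both slots
try:
    np.add.at(A, 0, np.ones(2))    # fails: a length-2 update for one scalar slot
except ValueError as err:
    print("broadcast error:", err)
```

The hard part, as noted above, is that nothing in this error points back at the forward-pass operation that set up the mismatched shapes.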
Luckily it was still working with a fairly recent version, so it should be possible to find the problem by bisecting.
Maybe the above example can become a test to catch such problems in the future.
@mlxd tracked this down to https://github.com/PennyLaneAI/pennylane/commit/6d1ddf664531507fa35973c0ebb3d709ada7afc5 (bug fixes we made during the last release-candidate cycle!)
Hi @cvjjm, this seems to boil down to the change we chatted about around the release of v0.23.0. It seems like the logic added there did start causing issues after all :crying_cat_face:
Specifically, a new `if` statement was added here:
```python
if device.capabilities().get("provides_jacobian", False):
    # in the case where the device provides the jacobian,
    # the output of grad_fn must be wrapped in a tuple in
    # order to match the input parameters to _execute.
    return (return_vjps,)
return return_vjps
```
This was required to help with a device we have in the PennyLane-SF plugin. It seems, however, that as you suggest, it breaks other cases.
Removing the if statement makes the listed example execute correctly.
We could definitely make a fix for this. I'm curious: how severe would this bug be in its current form? I.e., would it bring value to make a bug-fix release such that it's `pip` installable, or would having it in v0.24.0 be a good target (~21st of June)?
Oh! I guess my take-home from the above is then:
1) Checking whether the device declares `provides_jacobian=True` is not a good indicator of whether the jacobian actually came from the device. I guess the wrapping in a tuple should at least be moved inside this else statement:
https://github.com/PennyLaneAI/pennylane/blob/bcd837b1ed1187895c327abfe62aea71fbeba02f/pennylane/interfaces/autograd.py#L220-L232
2) Is there a more robust way of checking whether the jacobian actually came from a device and thus needs to be wrapped? Maybe the shape or type of the jacobian can be used to decide whether wrapping is needed? This would even allow some room for error on the plugin developer side, which would be nice.
3) There needs to be a test in core PL with at least a "fake" device that implements a "fake" device jacobian feature (like returning a fixed number) so that such things can be caught. Currently there is no device that ships with PL that implements the device gradient, is there?
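On point 2, a rough sketch of what a shape/type-based check could look like. The names and the heuristic here are purely illustrative assumptions, not PennyLane's actual logic:

```python
import numpy as np


def needs_wrapping(jac):
    # Hypothetical heuristic: a raw device jacobian arrives as a single
    # stacked ndarray, whereas parameter-shift already yields a tuple/list
    # with one entry per trainable parameter. Only the former would need
    # the extra tuple to match _execute's inputs.
    return isinstance(jac, np.ndarray)


def maybe_wrap(jac):
    # wrap only when the jacobian plausibly came straight from the device
    return (jac,) if needs_wrapping(jac) else jac
```

An inspection like this would also tolerate plugins that declare `provides_jacobian` but end up being differentiated some other way, which is exactly the failure mode above.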
I would highly appreciate a 0.23.X bugfix release, not because I cannot wait until the 21st, but because it would be nice to have at least some v0.23.X version that I do not need to blacklist because of this :-)
Just as a reminder: The original issue from https://github.com/PennyLaneAI/pennylane/issues/2576#issue-1237311914 is also still open.
Thanks @josh146 for hunting down the test that was meant to preserve the feature of non-standard return types!
Looking at the test, it is fairly clear that the reason this was not triggered is that someone simply re-declared the `R_DTYPE` of the device to be `SpecialObservable`:
https://github.com/PennyLaneAI/pennylane/blob/1a13bef8adfa85a86a7567f41294b3f73285a294/tests/gradients/test_parameter_shift.py#L1072-L1074
This makes the test pass, but it also means that the device can only ever evaluate `QNode`s that return a `SpecialObservable`.
What I need, though, is a device that supports both standard observables and my special observable. Probably the test can/should be expanded to also have the device evaluate a `QNode` returning a normal observable.
Hi @cvjjm, thanks for the details! :slightly_smiling_face: We're going forward with the fix for the device jacobian (see https://github.com/PennyLaneAI/pennylane/pull/2595). After chatting internally, we'd target `v0.24.0` to have it in. We will also consider the other issues here.
Hi, as far as I can tell the test is still in the "vandalized" state in the current master. Is there a chance we can get this functionality restored in the next release?
Hi @cvjjm, apologies that the progress here has slowed down on the original issue, we can definitely have this in for the next release (v0.25.0). Thanks for the reminder! :+1:
Hi @cvjjm, I created a PR implementing the custom-object fix, plus re-adding a test case that seems to have disappeared from PL. But I don't think this kind of solution would work in the case of JAX, Torch, or TF, since in general we register a custom primitive with each of these frameworks (representing the circuit) that must return a tensor. The only solution I can think of is some packing and unpacking mechanism, similar to JAX pytrees.
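The packing/unpacking idea could look roughly like this. This is a hand-rolled sketch with illustrative names, not `jax.tree_util`: flatten the structured result into a flat list of leaves plus a rebuild spec before handing it to the framework, and reassemble it on the way out:

```python
def tree_flatten(result):
    # flatten one level of list/tuple structure into leaves plus a spec
    # that remembers how to rebuild the container
    if isinstance(result, (list, tuple)):
        return list(result), type(result)
    return [result], None  # a bare leaf


def tree_unflatten(leaves, spec):
    # invert tree_flatten: rebuild the original container (or bare leaf)
    if spec is None:
        return leaves[0]
    return spec(leaves)
```

Only the flat leaves would ever cross the framework boundary as tensors; the custom objects would live in the spec on the Python side.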
Thanks! That fix looks perfect and, yes, I agree that this will only work with autograd/NumPy, but that is fine. There is an additional change in master that also breaks special return types, which will show up when you merge master into the fix branch. See my comment over there.
It seems that this "feature" is now broken in the latest master.
More precisely, the line here https://github.com/PennyLaneAI/pennylane/blob/1a13bef8adfa85a86a7567f41294b3f73285a294/pennylane/_qubit_device.py#L299 makes it impossible to have an observable that returns something that is not castable to `float`. Would it be possible to put this line in a try/except block and return either the plain result or a `dtype=object` array in case `result` cannot be cast to `float`?
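The suggested fallback could look like this. A minimal sketch only: `asarray_with_fallback` and `SpecialValue` are hypothetical names, not the actual `_qubit_device.py` code:

```python
import numpy as np


def asarray_with_fallback(result):
    # try the float cast the device currently performs, and fall back to a
    # dtype=object array when the observable returns a custom type that
    # cannot be cast to float
    try:
        return np.asarray(result, dtype=np.float64)
    except (TypeError, ValueError):
        return np.asarray(result, dtype=object)


class SpecialValue:
    # stand-in for a custom observable return type with no float conversion
    def __init__(self, val):
        self.val = val
```

Ordinary numeric results would take the fast float path unchanged, while custom return types would survive as an object array instead of raising.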