Balandat opened 4 years ago
It looks like this is because `_quad_form_derivative` is returning incorrectly shaped gradients. A workaround is simply to delete that method from `TriangularLazyTensor`. In your example:

```python
root._quad_form_derivative(torch.randn(4, 1), torch.randn(4, 1))[0].shape
# torch.Size([4, 4])
super(TriangularLazyTensor, root)._quad_form_derivative(torch.randn(4, 1), torch.randn(4, 1))[0].shape
# torch.Size([2, 2, 2])
```

Looking at the source before this commit, the difference is that `CholLazyTensor` never actually had a `_quad_form_derivative` method and instead used the standard `LazyTensor._quad_form_derivative`.
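To make the expected shape concrete, here is a hedged, plain-PyTorch illustration (not gpytorch internals): for a dense 4x4 matrix `A`, the gradient of the quadratic form `left^T A right` with respect to `A` is the outer product `left @ right.t()`, which is the shape the standard `_quad_form_derivative` should produce.

```python
import torch

# Hedged illustration (plain PyTorch, not gpytorch code): the gradient of
# left^T A right with respect to a dense 4x4 A is the outer product
# left @ right^T, so it should have shape (4, 4), matching A.
left = torch.randn(4, 1)
right = torch.randn(4, 1)

A = torch.randn(4, 4, requires_grad=True)
quad_form = (left.t() @ A @ right).sum()
quad_form.backward()

print(A.grad.shape)                              # torch.Size([4, 4])
print(torch.allclose(A.grad, left @ right.t()))  # True
```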
🐛 Bug
A gradient shape incompatibility error is triggered when back-propagating through the product of a TriangularLazyTensor (TLT) and a tensor. This was discovered in the context of sampling from an MVN posterior using base samples in https://github.com/pytorch/botorch/issues/513.
This issue is caused by the new `TriangularLazyTensor` from #1102: the following code works fine pre-merge on 3e87f849c56c5ad018ac535052080b6649b244da, but fails on the merge commit 4e6f2d0b0988409f312cdfe97eff0754bc356d48.
I haven't been able to dig much deeper, but the plot thickens.
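For context, the sampling pattern in question can be sketched in plain PyTorch (a hedged analogue, not the original gpytorch/botorch code): reparameterized MVN sampling draws `sample = mean + L @ base_sample` for a triangular root `L`, and then back-propagates through the sample.

```python
import torch

# Hedged sketch (plain PyTorch, not the original gpytorch/botorch code):
# reparameterized MVN sampling via a triangular root of the covariance,
# sample = mean + L @ base_sample, followed by back-propagation.
mean = torch.zeros(4, requires_grad=True)
cov = 2.0 * torch.eye(4)
L = torch.linalg.cholesky(cov)   # lower-triangular root of the covariance
base_sample = torch.randn(4)

sample = mean + L @ base_sample  # reparameterization trick
sample.sum().backward()

print(mean.grad.shape)  # torch.Size([4])
```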
To reproduce
Stack trace
Expected Behavior
Gradients work.
Additional context
Note that this is not simply a blatant bug in `TriangularLazyTensor` itself, since the following works fine:
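The working snippet the author refers to is not reproduced in this excerpt. As a hedged plain-PyTorch analogue of the point being made, back-propagating through a triangular matrix product is itself well-behaved shape-wise:

```python
import torch

# Hedged plain-PyTorch analogue (the gpytorch snippet referenced above is
# not shown here): back-propagation through a lower-triangular
# matrix-vector product produces a gradient matching the matrix's shape.
A = torch.randn(4, 4, requires_grad=True)
L = torch.tril(A)    # lower-triangular factor
x = torch.randn(4, 1)

(L @ x).sum().backward()
print(A.grad.shape)  # torch.Size([4, 4])
```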