Hi @hbgtjxzbbx ,
Thanks for your important bug report! As of today, a simple workaround is to proceed as follows:
```python
from pykeops.torch import LazyTensor
import torch

a = torch.rand(2, 1000, 5)
a.requires_grad = True
b = torch.rand(2, 1000, 5)
c1 = torch.rand(2, 1000, 5)
c2 = torch.rand(2, 1000, 5)

# Batched Gaussian kernel matrix, encoded as a KeOps LazyTensor:
a_i = LazyTensor(a[:, :, None])
b_j = LazyTensor(b[:, None])
dist = a_i.sqdist(b_j)
kernel = dist.exp()


class ContiguousBackward(torch.autograd.Function):
    """Identity operation whose backward pass casts the gradient to a contiguous tensor."""

    @staticmethod
    def forward(ctx, input):
        return input

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.contiguous()


# Test case 1: "permute" after the kernel reduction
d1 = ContiguousBackward.apply(kernel @ c1)
d1.permute(0, 2, 1).mean().backward()

# Test case 2: "cat" of two kernel reduction results
d1 = ContiguousBackward.apply(kernel @ c1)
d2 = ContiguousBackward.apply(kernel @ c2)
d_cat = torch.cat([d1, d2], 2)
d_cat.mean().backward()
```
The `ContiguousBackward` wrapper ensures that all tensors passed to the backward KeOps function are indeed contiguous.
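For context: operations such as `permute` (and the backward of `cat`, which slices the incoming gradient) produce non-contiguous views, so the gradient that autograd hands back to the KeOps node may itself be non-contiguous. Here is a minimal, KeOps-free sketch of the effect (the tensors `x`, `w`, `y` are purely illustrative):

```python
import torch

x = torch.rand(2, 1000, 5, requires_grad=True)
w = torch.rand(2, 5, 1000)

y = x * 2.0
# Inspect the gradient that reaches y during the backward pass:
y.register_hook(lambda g: print("incoming grad contiguous?", g.is_contiguous()))

# Permuting the output turns that incoming gradient into a non-contiguous view:
(y.permute(0, 2, 1) * w).sum().backward()   # prints: incoming grad contiguous? False
```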
Going forward, I believe that we could adopt this behavior by default... Back in 2019, we opted against silently casting the inputs to KeOps routines as contiguous tensors. The idea was to make users aware of the (small) cost of these transposition operations, thus allowing them to best profile their programs.
With hindsight, however, we were probably over-thinking the problem: the overhead of the `.contiguous()` operation is negligible in most (all?) settings.
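To put a rough number on "negligible", here is a small timing sketch one could run (the shapes and timings are illustrative and hardware-dependent, not measurements from this issue):

```python
import time
import torch

x = torch.rand(5, 1_000_000).t()          # non-contiguous tensor of shape (1000000, 5)
assert not x.is_contiguous()

t0 = time.perf_counter()
for _ in range(100):
    _ = x.contiguous()                     # one full copy per call
t1 = time.perf_counter()
print(f".contiguous(): {(t1 - t0) / 100 * 1e3:.2f} ms per call for 5M float32 entries")
```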
What do you think @bcharlier, @joanglaunes ?
Best regards, and see you soon! Jean
Jean, Thanks for fixing this :)
Hi all,
to be more specific:
The NumPy binder silently casts inputs into C-style (contiguous) arrays, meaning that you pay a copy whenever an input tensor is not contiguous, at this line: https://github.com/getkeops/keops/blob/92ba03e22e87e56a7e626f128edd4ee44a8ce2f3/pykeops/numpy/generic/generic_red.cpp#L44
This is not the case for the torch bindings: an error is thrown instead, see https://github.com/getkeops/keops/blob/92ba03e22e87e56a7e626f128edd4ee44a8ce2f3/pykeops/torch/generic/generic_red.cpp#L31
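For illustration, the two behaviors boil down to something like this at the Python level (this is not the actual binding code, which lives in the C++ files linked above, and the error message is made up):

```python
import numpy as np
import torch

x_np = np.asfortranarray(np.random.rand(1000, 5))   # not C-contiguous
x_th = torch.rand(5, 1000).t()                        # not contiguous either

# NumPy-binder style: silently copy into a C-ordered array when needed.
x_np_c = np.ascontiguousarray(x_np)                   # pays one copy here

# Torch-binder style (before the fix): reject non-contiguous inputs.
if not x_th.is_contiguous():
    raise ValueError("input tensor must be contiguous")  # illustrative message
```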
Hi @bcharlier ,
Thanks for the details :-) Would you be fine with just casting all tensors to a contiguous layout prior to the generic reductions? If the tensors are already contiguous, this does nothing; otherwise, in my experience, the cost of the "transpose" is negligible anyway. Do you see a situation where this could be a problem?
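Concretely, this would amount to something like the following at the entry point of the generic reduction (just a sketch; `_make_contiguous` is a hypothetical helper, not an existing KeOps function):

```python
import torch

def _make_contiguous(*args):
    # Hypothetical helper: return the input tensors, casting to a contiguous
    # layout only when needed. Already-contiguous tensors are passed through
    # untouched, so the copy is only paid by non-contiguous inputs.
    return tuple(
        x.contiguous() if isinstance(x, torch.Tensor) and not x.is_contiguous() else x
        for x in args
    )
```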
See you soon, Jean
In fact, you will pay a copy (not a simple view change) of the non-contiguous variable. Usually this is not a major problem, but when M or N are large it can be an issue (in terms of memory size, at least).
That being said, if 99.9% of users apply the `.contiguous()` method by hand to fix this, we may as well do it automatically :)
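As a rough order of magnitude for that memory cost (the shapes below are examples, not taken from this issue): copying a non-contiguous float32 tensor of shape (M, D) allocates an extra 4·M·D bytes.

```python
import torch

M, D = 1_000_000, 64                      # example sizes only
x = torch.rand(D, M).t()                  # non-contiguous view of shape (M, D)

extra_bytes = x.nelement() * x.element_size()
print(f"extra memory for .contiguous(): {extra_bytes / 1e6:.0f} MB")  # ~256 MB here
```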
Hello @hbgtjxzbbx ,
We decided to apply a `.contiguous()` to all non-contiguous input tensors in the PyTorch bindings, with a warning in the forward pass (since it causes an extra copy that the user may want to avoid), and silently in the backward pass. The update has been merged into master, so I am closing this issue now. Feel free to re-open if needed.
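For readers landing on this thread later, the adopted policy amounts to something like the sketch below (illustrative only; the actual code lives in the PyKeOps torch bindings):

```python
import warnings
import torch

def _ensure_contiguous(x, warn):
    # Sketch of the adopted policy: cast non-contiguous inputs to contiguous,
    # with a warning in the forward pass (warn=True) and silently in the
    # backward pass (warn=False). Name and signature are illustrative.
    if isinstance(x, torch.Tensor) and not x.is_contiguous():
        if warn:
            warnings.warn("input tensor is not contiguous; an extra copy will be made")
        return x.contiguous()
    return x
```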
Hello :) Thanks for the great library! I ran into contiguity issues during the backward call. Any suggestion would be super helpful, thanks! One is caused by a "permute" operation after the kernel reduction and the other is caused by "cat"-ing two kernel reduction results.
both output:
Here is the example code