getkeops / keops

KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
https://www.kernel-operations.io
MIT License

Gradient broadcasting problem with complex on last dimension #266

Closed flbbb closed 1 year ago

flbbb commented 1 year ago

Gradient doesn't seem to broadcast on the last dimension when using complex tensors. This works:

import torch
from pykeops.torch import LazyTensor

H = 100
N = 1000
L = 5000
D = 10

# Real: broadcasting on the last dimension (1 vs D) works
x_i = LazyTensor(torch.randn(H, N, 1, 1, dtype=torch.float32, requires_grad=True))
y_j = LazyTensor(torch.randn(1, 1, L, D, dtype=torch.float32, requires_grad=True))

D_ij = x_i * y_j

a_i = D_ij.sum(dim=1)
a_i.sum().backward()

This doesn't work:

import torch
from pykeops.torch import LazyTensor

H = 100
N = 1000
L = 5000
D = 10

# Complex: broadcasting on the last dimension (1 vs D) fails in backward
x_i = LazyTensor(torch.randn(H, N, 1, 1, dtype=torch.complex64, requires_grad=True))
y_j = LazyTensor(torch.randn(1, 1, L, D, dtype=torch.complex64, requires_grad=True))

D_ij = x_i * y_j

a_i = D_ij.sum(dim=1)

# Error here
a_i.sum().backward()
ValueError: [KeOps] Error : dimensions are not compatible for Broadcast operation (error at line 233 in file .../lib/python3.8/site-packages/keopscore/formulas/Operation.py)
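For context, the rule that the backward pass has to implement here: when a factor is broadcast along a dimension in the forward pass, its gradient must be summed over that dimension. Below is a minimal NumPy sketch of that reduction for a complex product, with plain arrays standing in for LazyTensors and illustrative shapes (not KeOps code, just the expected semantics):

```python
import numpy as np

# x has a trailing singleton dim and is broadcast against y's trailing dim D.
# The gradient of sum(x * y) w.r.t. x must then be summed over that broadcast
# dimension -- the step that failed for complex inputs in the report above.
N, D = 3, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((N, 1)) + 1j * rng.standard_normal((N, 1))
y = rng.standard_normal((1, D)) + 1j * rng.standard_normal((1, D))

# Forward: broadcasting multiply, then a full sum.
out = (x * y).sum()

# Backward: d(sum(x*y))/dx = y, reduced over the broadcast axis,
# keeping x's trailing singleton dimension.
grad_x = np.broadcast_to(y, (N, D)).sum(axis=1, keepdims=True)

# Sanity check by finite differences (the map is linear in x, so this is exact
# up to floating point).
eps = 1e-6
x2 = x.copy()
x2[0, 0] += eps
fd = ((x2 * y).sum() - out) / eps
assert np.allclose(fd, grad_x[0, 0], atol=1e-4)
```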

But this works:

import torch
from pykeops.torch import LazyTensor

H = 100
N = 1000
L = 5000
D = 10

# Complex: last dimensions match (D == D), broadcasting only on batch dims
x_i = LazyTensor(torch.randn(1, N, 1, D, dtype=torch.complex64, requires_grad=True))
y_j = LazyTensor(torch.randn(H, 1, L, D, dtype=torch.complex64, requires_grad=True))

D_ij = x_i * y_j

a_i = D_ij.sum(dim=1)
a_i.sum().backward()
joanglaunes commented 1 year ago

Hello @flbbb, we are sorry for getting to this several months later. There were indeed bugs in the definition of gradient operations with broadcasting on the last dimension: for multiplication with complex operands, but also for addition and subtraction mixing complex and real operands. This should now be fixed by the commit pushed to the main branch.
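For anyone pinned to a version without the fix, one possible workaround is to avoid complex dtypes entirely and carry the real and imaginary parts as two real tensors, so only the (working) real broadcasting path is exercised. The underlying identity, sketched here with NumPy arrays (the same split would apply to real-valued LazyTensors; shapes are illustrative):

```python
import numpy as np

# Emulate a broadcast complex product with real arrays:
# (a + ib)(c + id) = (ac - bd) + i(ad + bc)
rng = np.random.default_rng(1)
N, D = 5, 3
x = rng.standard_normal((N, 1)) + 1j * rng.standard_normal((N, 1))
y = rng.standard_normal((1, D)) + 1j * rng.standard_normal((1, D))

# Split into real parts; each pair broadcasts as ordinary real arrays.
a, b = x.real, x.imag
c, d = y.real, y.imag
prod_real = a * c - b * d
prod_imag = a * d + b * c

# The recombined result matches the direct complex product.
assert np.allclose(prod_real + 1j * prod_imag, x * y)
```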