wilson-labs / cola

Compositional Linear Algebra
Apache License 2.0

Adjoint operations move Jacobian from GPU to CPU #88

Closed yashsavani closed 1 month ago

yashsavani commented 6 months ago

🐛 Bug

The adjoint operations in CoLA move the Jacobian tensor from the GPU to the CPU. This silently incurs device-to-host transfers, hurting performance, and produces operators whose device no longer matches the inputs they are applied to.

To reproduce

Code snippet to reproduce

import torch
import cola

# Use the GPU if one is available
dev = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

x = torch.randn(100).to(dev)
fn = torch.nn.Sequential(torch.nn.Linear(100, 64), torch.nn.Linear(64, 100)).to(dev)

# The Jacobian operator lives on the GPU, but its transpose/adjoint views do not
J = cola.ops.Jacobian(fn, x)
print(J.device, J.T.device, J.H.device, cola.ops.Adjoint(J).device)

Stack trace/error message

cuda:0 cpu cpu cpu

Expected Behavior

Output should look like:

cuda:0 cuda:0 cuda:0 cuda:0


Additional context

Possibly an issue here: https://github.com/wilson-labs/cola/blob/main/cola/ops/operators.py#L361, where the device is not being used when the adjoint operator is constructed.
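For context, the usual remedy for this class of bug is to have a wrapper operator derive its device from the operator it wraps rather than from a default. A minimal sketch of that pattern in plain PyTorch (the class names here are illustrative, not CoLA's actual implementation):

```python
import torch

class Operator:
    """Toy stand-in for a linear operator that knows its device."""
    def __init__(self, A: torch.Tensor):
        self.A = A

    @property
    def device(self) -> torch.device:
        return self.A.device

class Adjoint(Operator):
    """Wrapper whose device is inherited from the wrapped operator.

    The buggy variant of this pattern falls back to a default device
    (e.g. CPU) instead of delegating to the wrapped operator.
    """
    def __init__(self, op: Operator):
        self.op = op

    @property
    def device(self) -> torch.device:
        # Delegate so .device stays consistent with the wrapped operator
        return self.op.device

A = Operator(torch.randn(3, 3))
print(Adjoint(A).device)  # -> cpu (the wrapped tensor's device)
```

With this delegation in place, an adjoint built from a CUDA-resident operator reports `cuda:0` automatically, which is the behavior expected in the output above.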

AndPotap commented 1 month ago

Thank you for pointing out this incorrect device allocation. Also thank you for the concise and well-thought-out code snippet to reproduce it. I've just added a fix in #97.