dotnet / TorchSharp

A .NET library that provides access to the library that powers PyTorch.

`Module.to(Device)` causes warnings because non-leaf tensors' `.grad` is accessed (also applies to optimizers) #1323

Closed · yueyinqiu closed this issue 3 months ago

yueyinqiu commented 4 months ago

https://github.com/dotnet/TorchSharp/issues/1322#issuecomment-2147081809

yueyinqiu commented 4 months ago

That's a bit tricky.

We can't just check param.requires_grad, because the parameter might be a non-leaf tensor that retains its grad (via retain_grad()). I'm also not sure whether there are other situations.
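
For illustration (a minimal PyTorch sketch, not from the original comment): a non-leaf tensor that calls retain_grad() does end up with a populated .grad, so neither requires_grad nor is_leaf alone tells us whether there is a grad to move.

import torch

w = torch.zeros(3, requires_grad=True)
y = w * 2          # y is a non-leaf tensor (result of an operation)
y.retain_grad()    # ask autograd to keep its grad anyway
y.sum().backward()

print(y.is_leaf)       # False
print(y.retains_grad)  # True
print(y.grad)          # tensor([1., 1., 1.]) -- populated, no warning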

yueyinqiu commented 4 months ago

Yes, there are... We can simply do this in PyTorch:

import torch

# A tensor that does not require grad can still carry a .grad,
# because .grad can be assigned to manually:
x = torch.zeros([], requires_grad=False)
x.grad = torch.zeros([]) + 10
print(x.grad)  # tensor(10.)

So we can never know whether there is a grad before we actually access it...

Perhaps the only solution is to add a new C++ method for this, or to find a way to temporarily suppress the warnings.
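
As a rough illustration of the second option (a Python-level sketch only; in TorchSharp the warning is raised by the native libtorch layer, so the equivalent would have to be done in C++ or via the warning handler): PyTorch itself allows the non-leaf .grad warning to be silenced temporarily.

import warnings
import torch

x = torch.zeros(3, requires_grad=True)
y = x * 2  # non-leaf; accessing y.grad normally emits a UserWarning

with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    g = y.grad  # warning suppressed inside this block
print(g)  # None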

NiklasGustafsson commented 3 months ago

Would it be enough to check tensor.is_leaf before accessing grad in the toEpilog() logic?
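
A small sketch of what such a guard might look like, written against the PyTorch API rather than TorchSharp's toEpilog() (the exact check is an assumption, not the library's current behavior): an is_leaf test avoids the warning, but on its own it would also skip non-leaf tensors that retain their grad, so the two checks probably need to be combined.

import torch

leaf = torch.zeros(3, requires_grad=True)
non_leaf = leaf * 2
non_leaf.retain_grad()
non_leaf.sum().backward()

plain_non_leaf = leaf * 3  # no retain_grad(); accessing .grad would warn

for t in (leaf, non_leaf, plain_non_leaf):
    # Only touch .grad when it is safe and meaningful to do so:
    if t.is_leaf or t.retains_grad:
        print("move grad:", t.grad)
    else:
        print("skip .grad to avoid the non-leaf warning")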

NiklasGustafsson commented 3 months ago

I'm confused -- all the instances of .grad in Module are accessing parameters. How can they be non-leaf tensors?

yueyinqiu commented 3 months ago

Hmmm... This issue was created because of #1322. In that issue, `this.conv.weight[..] = nn.Parameter(x.view(1, c1, 1, 1));` was used, which made the weight a non-leaf tensor.

Well, I have just reconsidered this... I suppose it's a misuse of slicing, and a warning is exactly what we want in this situation.

NiklasGustafsson commented 3 months ago

So, should #1322 also be closed, then?