dotnet / TorchSharp

A .NET library that provides access to the library that powers PyTorch.

How do I turn off these warnings? #1322

Closed: lintao185 closed this issue 3 months ago

lintao185 commented 4 months ago

[screenshot: warning messages printed to the console]

Excuse me, how do I turn off these warnings?

yueyinqiu commented 4 months ago

May be related to #1307.

Hmm... I've checked it again. PyTorch checks these things in its Python code and warns via the warnings package. But why is LibTorch also doing that at the C++ level and warning through c10? That's pretty strange...

By the way, in most cases you should resolve those warnings instead of just ignoring or suppressing them.

yueyinqiu commented 4 months ago

Actually we can set the c10 warning handler to redirect or 'catch' the warnings. But what should happen after the warnings have been caught? C# does not have a standard behavior for runtime warnings...

Would it be acceptable to introduce Microsoft.Extensions.Logging or something similar? @NiklasGustafsson
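A rough sketch of what that could look like, assuming a hypothetical hook (here called TorchWarnings) were exposed over the c10 handler and forwarded into Microsoft.Extensions.Logging (the console provider comes from the Microsoft.Extensions.Logging.Console package):

using System;
using Microsoft.Extensions.Logging;

using var factory = LoggerFactory.Create(builder => builder.AddConsole());
var logger = factory.CreateLogger("LibTorch");

// Forward any caught c10 warning into the standard .NET logging pipeline.
TorchWarnings.WarningRaised += msg => logger.LogWarning("{Message}", msg);

// Simulate a warning arriving from the native side.
TorchWarnings.Raise("The .grad attribute of a non-leaf Tensor is being accessed.");

// Hypothetical shim; TorchSharp does not expose the c10 warning handler today.
static class TorchWarnings
{
    public static event Action<string>? WarningRaised;
    public static void Raise(string message) => WarningRaised?.Invoke(message);
}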

lintao185 commented 4 months ago


[screenshot]

I just pushed the model to the GPU, which is a bit strange.

yueyinqiu commented 4 months ago

Hmm... It seems to happen because _toEpilog accesses param.grad without checking it first.
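A minimal sketch of the kind of guard that would avoid it, assuming the check is simply on is_leaf:

using TorchSharp;

var p = torch.zeros([1]);
p[..] = torch.zeros([1], requires_grad: true); // p is no longer a leaf tensor

// Reading p.grad() unconditionally triggers the c10 warning;
// guarding the access avoids it.
if (p.is_leaf)
{
    Console.WriteLine(p.grad());
}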

lintao185 commented 4 months ago

[screenshot: warnings triggered by .zero_grad()]

.zero_grad() will also cause warnings.

The root cause has been found: the line

this.conv.weight[..] = nn.Parameter(x.view(1, c1, 1, 1));

triggers it, and changing it to the following resolves the issue:

 using (torch.no_grad())
 {
     this.conv.weight[..] = nn.Parameter(x.view(1, c1, 1, 1));
 }
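A quick way to check the effect of that workaround (a sketch, using a plain zeros tensor as the target):

using TorchSharp;

var w = torch.zeros([1]);
using (torch.no_grad())
{
    w[..] = torch.zeros([1], requires_grad: true);
}
Console.WriteLine(w.requires_grad); // False: the copy is not tracked by autograd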
yueyinqiu commented 4 months ago

Hmmm... That's strange. I wouldn't expect torch.no_grad to have any effect on that.

Anyway, could you provide the implementation of RequiresGrad so I can reproduce it?

lintao185 commented 4 months ago
public static class ModuleExtend
{
    /// <summary>
    /// Sets whether the module's parameters require gradient updates.
    /// </summary>
    /// <param name="module">The module.</param>
    /// <param name="requires_grad">Whether gradients are required.</param>
    /// <returns>The module.</returns>
    public static T RequiresGrad<T>(this T module, bool requires_grad) where T : nn.Module
    {
        foreach (var parameter in module.parameters())
        {
            parameter.requires_grad_(requires_grad);
        }

        return module;
    }
}
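For reference, a usage sketch of this extension (the Linear module is just an arbitrary example):

using TorchSharp;
using static TorchSharp.torch;

// Freeze every parameter of a module in one call.
var model = nn.Linear(10, 10);
model.RequiresGrad(false);

foreach (var parameter in model.parameters())
    Console.WriteLine(parameter.requires_grad); // False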
yueyinqiu commented 4 months ago

Ahhh? That seems to mean:

using TorchSharp;

var tensor = torch.zeros([1]);
var tensor2 = torch.zeros([1], requires_grad: true);
Console.WriteLine(tensor.requires_grad); // False
tensor[..] = tensor2;
Console.WriteLine(tensor.requires_grad); // True

...Hmmm....

yueyinqiu commented 4 months ago

Well PyTorch has the same behavior... I don't know...

import torch

tensor = torch.zeros([1])
tensor2 = torch.zeros([1], requires_grad=True)
tensor[:] = tensor2
print(tensor.requires_grad)  # True
yueyinqiu commented 4 months ago

A minimal reproduction might be this:

using TorchSharp;

var p = torch.zeros([1]);
p[..] = torch.zeros([1], requires_grad: true);

Console.WriteLine(p.requires_grad);
Console.WriteLine(p.grad());

It prints

True
[W TensorBody.h:494] Warning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (function grad)
lintao185 commented 4 months ago

Yes

yueyinqiu commented 4 months ago

It seems to be a problem in LibTorch; PyTorch has the same problem:

import torch

p = torch.zeros([1])
p[:] = torch.zeros([1], requires_grad=True)
print(p.requires_grad)
print(p.grad)

It prints (for some reason PyCharm does not preserve the output order):

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at aten\src\ATen/core/TensorBody.h:494.)
  print(p.grad)
True
None
lintao185 commented 4 months ago
Python:

import torch
from torch import nn

p = nn.Parameter(torch.ones(1))
p2 = nn.Parameter(torch.ones(1))
p.requires_grad_(False)
print(p.requires_grad)   # False
print(p2.requires_grad)  # True
p.data[:] = p2
print(p.requires_grad)   # False

C# (TorchSharp):

using static TorchSharp.torch;
using TorchSharp;

var p = nn.Parameter(torch.ones(1));
var p2 = nn.Parameter(torch.ones(1));
p.requires_grad_(false);
Console.WriteLine(p.requires_grad);  // False
Console.WriteLine(p2.requires_grad); // True
p[..] = p2;
Console.WriteLine(p.requires_grad);  // True
yueyinqiu commented 4 months ago

Oh yes, so that's not a bug but an intentional design. And xxx.weight[:] = xxx makes the weight a non-leaf tensor, so we can't get grad on it.
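For completeness, the escape hatch the warning itself mentions is retain_grad; a sketch, assuming TorchSharp exposes it the way PyTorch does:

using TorchSharp;

var p = torch.zeros([1]);
p[..] = torch.ones([1], requires_grad: true); // p becomes a non-leaf tensor

p.retain_grad();              // explicitly keep grad on the non-leaf tensor
(p * 2).sum().backward();
Console.WriteLine(p.grad());  // populated, with no warning about non-leaf access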

lintao185 commented 4 months ago

The behaviors of PyTorch and TorchSharp are somewhat inconsistent.

yueyinqiu commented 4 months ago

It's because you are using '.data' in Python. If you want the same behavior in C#, use detach:

using static TorchSharp.torch;
using TorchSharp;

var p = nn.Parameter(torch.ones(1));
var p2 = nn.Parameter(torch.zeros(1));
p.requires_grad_(false);
Console.WriteLine(p.requires_grad); // False
Console.WriteLine(p2.requires_grad); // True
p.detach()[..] = p2;
Console.WriteLine(p.requires_grad); // False
Console.WriteLine((double)p); // 0

(We don't have .data in TorchSharp: https://github.com/dotnet/TorchSharp/issues/1305#issuecomment-2100082625)
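Purely as an illustration of that point, a tiny extension could stand in for .data by returning a detached view (this is not part of TorchSharp, and unlike PyTorch's .data it is simply detach under the hood):

using TorchSharp;
using static TorchSharp.torch;

static class TensorDataExtensions
{
    // Illustrative stand-in: detached from autograd, sharing the same storage as the original.
    public static Tensor data(this Tensor t) => t.detach();
}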

NiklasGustafsson commented 3 months ago

@lintao185 -- given what @yueyinqiu said in https://github.com/dotnet/TorchSharp/issues/1323#issuecomment-2166999264, should this be closed, too?