PaddlePaddle / PaDiff

Paddle Automatically Diff Precision Toolkits.

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn #98

Closed GreatV closed 1 year ago

GreatV commented 1 year ago
import torch
import paddle
import numpy as np

from padiff import create_model, auto_diff

class SparseDownSampleCloseBase(torch.nn.Module):
    def __init__(self, stride):
        super(SparseDownSampleCloseBase, self).__init__()
        self.pooling = torch.nn.MaxPool2d(stride, stride)
        self.large_number = 600

    def forward(self, d, mask):
        encode_d = -(1 - mask) * self.large_number - d

        d = -self.pooling(encode_d)
        mask_result = self.pooling(mask)
        d_result = d - (1 - mask_result) * self.large_number

        return d_result, mask_result

class SparseDownSampleCloseRaw(paddle.nn.Layer):
    def __init__(self, stride):
        super(SparseDownSampleCloseRaw, self).__init__()
        self.pooling = paddle.nn.MaxPool2D(stride, stride)
        self.large_number = 600

    def forward(self, d, mask):
        encode_d = -(1 - mask) * self.large_number - d

        d = -self.pooling(encode_d)
        mask_result = self.pooling(mask)
        d_result = d - (1 - mask_result) * self.large_number

        return d_result, mask_result
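An aside on what both classes compute (my reading of the code, not stated in the thread): the `-large_number` encoding turns max-pooling into a min-pool over valid (`mask == 1`) entries, since invalid pixels are pushed so far negative that the pooling window never selects them. A minimal NumPy sketch (hypothetical helper `sparse_min_pool`, stride assumed to divide the spatial dims):

```python
import numpy as np

def sparse_min_pool(d, mask, stride, large=600):
    # Encode: valid entries become -d, invalid ones a large negative value,
    # so max-pooling effectively takes the minimum d among valid entries.
    encode = -(1 - mask) * large - d
    h, w = d.shape
    pooled = encode.reshape(h // stride, stride, w // stride, stride).max(axis=(1, 3))
    mask_out = mask.reshape(h // stride, stride, w // stride, stride).max(axis=(1, 3))
    # Decode; windows with no valid entry are pushed back down by -large.
    d_out = -pooled - (1 - mask_out) * large
    return d_out, mask_out

d = np.array([[5., 2.], [9., 7.]])
mask = np.array([[1., 0.], [1., 1.]])   # the 2.0 entry is marked invalid
d_out, m_out = sparse_min_pool(d, mask, stride=2)
print(d_out, m_out)  # min over valid entries {5, 9, 7} -> 5.0; mask -> 1.0
```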

module = create_model(SparseDownSampleCloseBase(1))
layer = create_model(SparseDownSampleCloseRaw(1))

x = np.random.randn(1, 320, 320, 1).astype("float32")
y = np.random.randn(1, 320, 320, 1).astype("float32")
inp = ({"d": torch.as_tensor(x),
        "mask": torch.as_tensor(y)},
        {"d": paddle.to_tensor(x),
        "mask": paddle.to_tensor(y)})
auto_diff(module, layer, inp, auto_weights=True, atol=1e-4)
Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
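For context (my note, not from the thread): torch raises this exact RuntimeError when `.backward()` is called on an output whose graph contains no tensor with `requires_grad=True`, which is what happens when every input is a plain tensor. A minimal standalone repro:

```python
import torch

x = torch.randn(2, 2)      # requires_grad defaults to False
loss = (x * 2).sum()       # no grad_fn anywhere in this graph
try:
    loss.backward()
except RuntimeError as e:
    print(e)  # "element 0 of tensors does not require grad and does not have a grad_fn"
```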
feifei-111 commented 1 year ago

Hi, please check whether the network structure is written correctly. I tested it, and it errors out as soon as it is run.

feifei-111 commented 1 year ago

[screenshots]

GreatV commented 1 year ago
import paddle

class SparseDownSampleCloseRaw(paddle.nn.Layer):
    def __init__(self, stride):
        super().__init__()
        self.pooling = paddle.nn.MaxPool2D(stride, stride)
        self.large_number = 600

    def forward(self, d, mask):
        encode_d = -(1 - mask) * self.large_number - d

        d = -self.pooling(encode_d)
        mask_result = self.pooling(mask)
        d_result = d - (1 - mask_result) * self.large_number

        return d_result, mask_result

if __name__ == "__main__":
    model = SparseDownSampleCloseRaw(1)
    paddle.summary(model, [(3, 320, 320, 1), (3, 320, 320, 1)])

Could it be because it has no trainable parameters?

---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
  MaxPool2D-1    [[3, 320, 320, 1]]    [3, 320, 320, 1]          0       
===========================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 2.34
Forward/backward pass size (MB): 2.34
Params size (MB): 0.00
Estimated Total Size (MB): 4.69
---------------------------------------------------------------------------
feifei-111 commented 1 year ago

Hi, I just tested it: torch errors out during backward. The model in your last reply is the paddle one, and paddle errors out directly in the forward pass (the input shape is required to be 4- or 5-dimensional, but it is actually 3-dimensional).

[screenshot]
GreatV commented 1 year ago

Yes, I changed the shape, and it still errors.

feifei-111 commented 1 year ago

I'm not sure about torch, but paddle can run backward even without trainable parameters. The forward issue here may be an OP problem; let me confirm it for you.
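The same point holds in torch (my note, not from the thread): a parameter-free module can still take part in backward, as long as some input requires grad. A quick sketch:

```python
import torch

# MaxPool2d has no trainable parameters, yet gradients flow through it
# because the input tensor itself requires grad.
pool = torch.nn.MaxPool2d(2, 2)
x = torch.randn(1, 1, 4, 4, requires_grad=True)
pool(x).sum().backward()
print(x.grad is not None)  # True
```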

feifei-111 commented 1 year ago

Hey, this error is caused by importing both paddle and torch at the same time. It should count as a bug 😵

feifei-111 commented 1 year ago

I switched to a different GPU and it ran through... It seems the card was full just now, which raised a CUDA error. Can you try again? [screenshot]

GreatV commented 1 year ago

I switched machines and changed the shape, then reran the precision comparison at the top; it still raises the same error. The error is caused by importing both paddle and torch at the same time.