YaoleiQi / DSCNet

PyTorch implementation of Dynamic Snake Convolution (ICCV 2023)

Problem when inserting dynamic snake convolution into my own model #18

Open chenzean opened 1 year ago

chenzean commented 1 year ago

When I replaced ordinary convolutions with the snake convolution, I ran into the following error:

```
Traceback (most recent call last):
  File "E:\工作点2\train.py", line 332, in <module>
    main(args, config)
  File "E:\工作点2\train.py", line 107, in main
    loss_epoch_train, psnr_epoch_train, ssim_epoch_train = train(train_loader, device, net, criterion, optimizer, logger)
  File "E:\工作点2\train.py", line 211, in train
    loss_total.backward()
  File "D:\Anaconda3\envs\pytorch_gpu\lib\site-packages\torch\_tensor.py", line 488, in backward
    self, gradient, retain_graph, create_graph, inputs=inputs
  File "D:\Anaconda3\envs\pytorch_gpu\lib\site-packages\torch\autograd\__init__.py", line 199, in backward
    allow_unreachable=True, accumulate_grad=True)  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1280, 64, 5, 16]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```

How should I fix this?
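For context, this class of error is not specific to DSCNet: autograd raises it whenever an in-place op overwrites a tensor that `backward()` still needs. A minimal standalone reproduction (not the repository's code; the tensor names here are illustrative only):

```python
import torch

# sigmoid's backward pass needs its own output b, so modifying b
# in place bumps its version counter and breaks backward().
a = torch.randn(4, requires_grad=True)
b = torch.sigmoid(a)
torch.relu_(b)  # in-place ReLU overwrites b

error_message = ""
try:
    b.sum().backward()
except RuntimeError as e:
    error_message = str(e)  # "...modified by an inplace operation..."

# The fix: use the out-of-place op so the saved tensor stays intact.
a2 = torch.randn(4, requires_grad=True)
b2 = torch.sigmoid(a2)
c2 = torch.relu(b2)   # out-of-place: b2 is left untouched
c2.sum().backward()   # backward now succeeds
```

The same conflict arises in the other direction, too: if a later op modifies a ReLU output in place, `ReluBackward0` is the node that complains, as in the traceback above.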

chenzean commented 1 year ago

Training works fine when only ordinary convolutions are used.

YaoleiQi commented 1 year ago

Hello, this problem looks related to the loss function. We would need more specific details to pinpoint where it comes from.

chenzean commented 1 year ago

I tried modifying the activation function inside the snake convolution, changing ReLU to GELU, and then it trained. Does this change have a large impact?

chenzean commented 1 year ago

I used torch.autograd.set_detect_anomaly(True) to locate the problem, and it turned out the ReLU activation inside the snake convolution was the culprit.
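A sketch of this debugging-and-fix workflow, assuming the block's activation was constructed as `nn.ReLU(inplace=True)` (the usual trigger for this error; the actual DSCNet module may differ). Besides swapping to GELU, setting `inplace=False` is the other common fix, since GELU has no in-place variant at all:

```python
import torch

# Anomaly detection makes the RuntimeError's traceback point at the
# forward op (here, the ReLU) that produced the later-modified tensor.
torch.autograd.set_detect_anomaly(True)

act_inplace = torch.nn.ReLU(inplace=True)   # can trigger the version conflict
act_safe = torch.nn.ReLU(inplace=False)     # out-of-place fix, same math
act_gelu = torch.nn.GELU()                  # the swap the poster used

x = torch.randn(2, 3, requires_grad=True)
y = act_gelu(act_safe(x))  # either safe variant trains normally
y.sum().backward()

torch.autograd.set_detect_anomaly(False)    # disable: it slows training
```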

YaoleiQi commented 1 year ago

Thank you for adapting our method to other tasks, and for sharing your solution. We have not tested the effect of GELU, so we can only give a definitive answer after running experiments, but in principle there should not be much difference.

chenzean commented 1 year ago

Got it, thanks!