chenzean opened this issue 1 year ago
Training only works normally when plain convolution is used.
Hello, this problem looks related to the loss function. We would probably need more specific details to pin down where it comes from.
I tried modifying the activation function inside the snake convolution, changing ReLU to GELU, and then training worked. May I ask whether this change has a big impact?
I used torch.autograd.set_detect_anomaly(True) to locate the problem, and it turned out that the ReLU activation inside the snake convolution was the cause.
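For anyone hitting the same error: the sketch below is not the actual DSConv code, only a minimal illustration of the mechanism. ReLU's backward pass relies on the saved ReLU output, so an in-place edit of that output trips the version-counter check, whereas GELU's backward only needs its input, which may be why swapping the activation avoided the problem.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 8, requires_grad=True)

# ReLU saves its *output* for the backward pass; editing that output in place
# bumps its version counter, and backward() then raises the same
# "modified by an inplace operation ... ReluBackward0" RuntimeError.
y = nn.ReLU()(x)
y.add_(1.0)              # in-place edit of the saved ReLU output
# y.sum().backward()     # uncommenting this reproduces the reported error

# GELU's backward only needs its input, so the same in-place edit goes through,
# which is consistent with training working after the ReLU -> GELU swap.
z = nn.GELU()(x)
z.add_(1.0)
z.sum().backward()       # runs without the in-place error
```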
Thank you very much for adapting our method to other tasks, and for sharing a solution. We have not tested the effect of GELU, so we may need to run experiments before giving you a definitive answer, but in principle there should not be much difference.
Got it, thanks.
When I replaced the plain convolution with the snake convolution, I encountered the following error: Traceback (most recent call last): File "E:\工作点2\train.py", line 332, in <module>
main(args, config)
File "E:\工作点2\train.py", line 107, in main
loss_epoch_train, psnr_epoch_train, ssim_epoch_train = train(train_loader, device, net, criterion, optimizer, logger)
File "E:\工作点2\train.py", line 211, in train
loss_total.backward()
File "D:\Anaconda3\envs\pytorch_gpu\lib\site-packages\torch_tensor.py", line 488, in backward
self, gradient, retain_graph, create_graph, inputs=inputs
File "D:\Anaconda3\envs\pytorch_gpu\lib\site-packages\torch\autograd__init__.py", line 199, in backward
allow_unreachable=True, accumulate_grad=True) # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1280, 64, 5, 16]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
May I ask how this should be fixed?
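As noted in the replies above, one way to track down where the in-place modification happens is to wrap the training step in anomaly detection. The sketch below uses a hypothetical tiny model and dummy data standing in for the real network and loader in train.py; only the wrapper itself reflects the approach discussed here.

```python
import torch
import torch.nn as nn

# With anomaly detection enabled, a backward error additionally prints the
# forward-pass stack trace of the operation whose output was later modified
# in place, which is how the ReLU inside the snake convolution was pinpointed.
torch.autograd.set_detect_anomaly(True)

# Hypothetical stand-ins for net / criterion / batch / target from train.py.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 3, 3, padding=1))
criterion = nn.MSELoss()
batch = torch.randn(2, 3, 16, 16)
target = torch.randn(2, 3, 16, 16)

loss_total = criterion(net(batch), target)
loss_total.backward()    # on failure, now also reports the offending forward op
```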