dongzhang89 / SR-SS

Implementation for paper: Self-Regulation for Semantic Segmentation

When I try to run your code, there are many bugs #6

Open TyroneLi opened 3 years ago

TyroneLi commented 3 years ago

I just ran training following your README, changing only my data path, and I get the following crash:

```
torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315])
label: torch.Size([1, 315, 315])
tensor([[0.6681, 0.5503, 0.4859, 0.3670, 0.4030, 0.4132, 0.3604, 0.5969, 0.5407, 0.4653, 0.5853, 0.6330, 0.4966, 0.3281, 0.4031, 0.5070, 0.6057, 0.4516, 0.4128, 0.5618, 0.3800]], device='cuda:0', grad_fn=<SigmoidBackward>)
torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315]) torch.Size([1, 21, 315, 315])
label: torch.Size([1, 315, 315])
[W TensorIterator.cpp:918] Warning: Mixed memory format inputs detected while calling the operator. The operator will output contiguous tensor even if some of the inputs are in channels_last format. (function operator())
Time:2021-09-29 22:47:34 [iter 0/40000] loss: 354.8209, lr: 1.000000e-02
tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]], device='cuda:0', grad_fn=<SigmoidBackward>)
torch.Size([1, 21, 270, 270]) torch.Size([1, 21, 270, 270]) torch.Size([1, 21, 270, 270]) torch.Size([1, 21, 270, 270]) torch.Size([1, 21, 270, 270])
label: torch.Size([1, 270, 270])
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCTensorMath.cu line=19 error=710 : device-side assert triggered
Traceback (most recent call last):
  File "trainval.py", line 166, in <module>
    loss2_segmentation = get_seg_loss(out, label)
  File "/home/hadoop-vacv/cephfs/data/lijinlong/codes/SEMISSS/20210922/SR-SS/lib/models/losses.py", line 132, in get_seg_loss
    loss += criterion(m(preds[i]), label)
  File "/home/hadoop-vacv/cephfs/data/lijinlong/local/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/hadoop-vacv/cephfs/data/lijinlong/local/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 211, in forward
    return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/hadoop-vacv/cephfs/data/lijinlong/local/anaconda3/envs/mmdet/lib/python3.8/site-packages/torch/nn/functional.py", line 2220, in nll_loss
    ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: cuda runtime error (710) : device-side assert triggered at /pytorch/aten/src/THC/generic/THCTensorMath.cu:19
/pytorch/aten/src/ATen/native/cuda/Loss.cu:106: operator(): block: [0,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
[the same assertion failure repeats for threads [1,0,0] through [20,0,0]]
```
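For what it's worth, the `Loss.cu:106` assertion (`input_val >= zero && input_val <= one`) comes from PyTorch's CUDA kernel for `binary_cross_entropy`, not from the `nll_loss` call shown in the traceback: CUDA kernels run asynchronously, so a device-side assert is often reported at a later synchronization point. In the log, the image-level prediction tensor (the one with `grad_fn=<SigmoidBackward>`) is already all `nan` after the first step, and `nan` fails the BCE range check. Below is a minimal sketch of that failure mode, assuming the classification branch feeds sigmoid outputs into `F.binary_cross_entropy`; I haven't verified this against `losses.py`, so treat it as an illustration, not the repo's actual code:

```python
# Minimal sketch, independent of SR-SS: once the logits go nan,
# sigmoid(logits) is nan, and binary_cross_entropy's CUDA kernel asserts
# that every input lies in [0, 1] -- nan fails that comparison and aborts
# the whole CUDA context ("device-side assert triggered").
import torch
import torch.nn.functional as F

logits = torch.full((1, 21), float('nan'), device='cuda')  # diverged logits
probs = torch.sigmoid(logits)   # nan in -> nan out
target = torch.zeros_like(probs)

# Optional guard: surfaces the divergence with a readable Python error
# instead of an unrecoverable device-side assert.
if not torch.isfinite(probs).all():
    raise RuntimeError("predictions went non-finite before the BCE loss")

loss = F.binary_cross_entropy(probs, target)  # would hit Loss.cu:106 here
```

Running the training script with `CUDA_LAUNCH_BLOCKING=1` makes kernel launches synchronous, so the traceback would point at the op that actually asserted and confirm whether the failure really starts in the sigmoid/BCE branch rather than in `get_seg_loss`. Also, the very first reported loss (354.8209 at iter 0 with lr 1e-2) already looks like divergence, so a smaller learning rate might avoid the `nan` in the first place.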