tianbaochou / NasUnet


Using my own images, the error `RuntimeError: The size of tensor a (12) must match the size of tensor b (13) at non-singleton dimension 3` appears #6

Closed: berisfu closed this issue 5 years ago

berisfu commented 5 years ago

```
Traceback (most recent call last):
  File "search_cell_sim.py", line 312, in <module>
    search_network.run()
  File "search_cell_sim.py", line 212, in run
    self.train()
  File "search_cell_sim.py", line 262, in train
    predicts = self.model(input)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "../search/backbone/nas_unet_search.py", line 181, in forward
    return self.net(x, weights1_down, weights1_up, weights2_down, weights2_up)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "../search/backbone/nas_unet_search.py", line 68, in forward
    s0, s1 = s1, cell(s0, s1, weights1_down, weights2_down)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "../search/backbone/cell.py", line 82, in forward
    tmp_list += [self._ops[offset+j](h, weight1[offset+j], weight2[offset+j])]
  File "/root/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "../search/backbone/cell.py", line 32, in forward
    rst = sum(w * op(x) for w, op in zip(weights2, self._ops))
RuntimeError: The size of tensor a (12) must match the size of tensor b (13) at non-singleton dimension 3
```

tianbaochou commented 5 years ago

This happens when the image width and height are not of the form 2^N (or 2^N × M), where N must be greater than or equal to the layer depth, because the network has the same number of down-sampling and up-sampling cells. If you really want to keep the input size at 110 x 110 x 7, you need to track each down-sampling step and make sure the output_padding here is 1 when the previous layer's size is odd and 0 when it is even. I suggest you resize the images to 96 or 128 and interpolate the output back to 110. Good luck!
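For illustration, here is a minimal sketch of that workaround in plain PyTorch (the `model` call and the class count are placeholders, not the actual NasUnet API): resize the input to 96 x 96 before the forward pass, then interpolate the logits back to 110 x 110.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 7, 110, 110)     # original input: 110 x 110 with 7 channels

# Resize to a size that halves cleanly through the down-sampling cells (96 = 2^5 * 3)
x96 = F.interpolate(x, size=(96, 96), mode='bilinear', align_corners=False)

# logits = model(x96)               # would yield (1, n_classes, 96, 96)
logits = torch.randn(1, 2, 96, 96)  # stand-in for the model output

# Interpolate the prediction back to the original resolution
out = F.interpolate(logits, size=(110, 110), mode='bilinear', align_corners=False)
print(out.shape)                    # torch.Size([1, 2, 110, 110])
```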

berisfu commented 5 years ago


N should be larger than the layer depth? Where do I set the layer depth? In nas_unet_nerve.yml?

tianbaochou commented 5 years ago


Yes. Searching and training can each set the depth separately.
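As a quick rule of thumb (a sketch, not repo code; `depth` stands in for whatever the yml key is actually called), a side length is safe when it can be halved `depth` times without remainder:

```python
def size_ok(size: int, depth: int) -> bool:
    # True if `size` divides evenly by 2^depth, i.e. survives `depth` halvings
    return size % (2 ** depth) == 0

for s in (96, 110, 128):
    print(s, size_ok(s, depth=4))  # 96 -> True, 110 -> False, 128 -> True
```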

berisfu commented 5 years ago


I have cropped the data from 110 to 96 and the size-mismatch error disappeared, but a new error occurs:

```
/pytorch/aten/src/THCUNN/SpatialClassNLLCriterion.cu:99: void cunn_SpatialClassNLLCriterion_updateOutput_kernel(T *, T *, T *, long *, T *, int, int, int, int, int, long) [with T = float, AccumT = float]: block: [0,0,0], thread: [65,0,0] Assertion `t >= 0 && t < n_classes` failed.
... (the same assertion repeats for threads [66,0,0], [80,0,0], [81,0,0], [83,0,0], and [84,0,0]) ...
Traceback (most recent call last):
  File "search_cell_sim.py", line 312, in <module>
    search_network.run()
  File "search_cell_sim.py", line 212, in run
    self.train()
  File "search_cell_sim.py", line 266, in train
    self.train_loss_meter.update(train_loss.item())
RuntimeError: CUDA error: device-side assert triggered
```
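This device-side assert (`t >= 0 && t < n_classes`) usually means the target mask contains class indices outside the valid range, a common case being binary masks stored as {0, 255} instead of {0, 1}. Running with `CUDA_LAUNCH_BLOCKING=1` pinpoints the failing call; a quick CPU-side check is sketched below (hypothetical names; adapt them to the actual dataloader):

```python
import torch

n_classes = 2                                    # assumption: binary nerve segmentation
mask = (torch.rand(96, 96) > 0.5).long() * 255   # stand-in mask stored as {0, 255}

print(mask.unique())       # tensor([  0, 255]) -> 255 is out of range for NLL loss
mask = (mask > 0).long()   # remap labels to {0, 1} before computing the loss
assert mask.min() >= 0 and mask.max() < n_classes
```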