fangxu622 closed this issue 5 years ago
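I'm not sure whether this is a problem in the code or a PyTorch bug. @Jiangfeng-Xiong
I have changed the batch size too, but I never ran into this problem. The most likely cause is a version difference. My environment is Python 2.7, Ubuntu, PyTorch 0.3.0.
@Jiangfeng-Xiong Today I switched to CentOS 7, Python 2.7, PyTorch 0.3.1, and I still get the same problem. Because this machine uses CUDA 9, I can't switch to 0.3.0. Could it be a difference between 0.3.0 and 0.3.1?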
(python27) bash-4.2$ ./run_train.sh
epoch 0 with learning rate: 0.001000
/home/sensetime/test/satellite_seg/utils/loss.py:16: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
log_p = F.log_softmax(input)
/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py:465: UserWarning: self and mask not broadcastable, but have the same number of elements. Falling back to deprecated pointwise behavior.
return tensor.masked_select(mask)
Traceback (most recent call last):
File "train.py", line 193, in <module>
train(args)
File "train.py", line 152, in train
loss.backward()
File "/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
File "/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/function.py", line 91, in apply
return self._forward_cls.backward(self, *args)
File "/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/_functions/tensor.py", line 481, in backward
grad_tensor = grad_tensor.masked_scatter(mask, grad_output)
File "/home/sensetime/test/anaconda2/envs/python27/lib/python2.7/site-packages/torch/autograd/variable.py", line 427, in masked_scatter
return self.clone().masked_scatter_(mask, variable)
RuntimeError: invalid argument 1: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at /opt/conda/conda-bld/pytorch_1518238441757/work/torch/lib/THC/generic/THCTensor.c:326
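Possibly. PyTorch has changed quite a lot since then, and that may be related to this problem. Try building and installing PyTorch from the latest source on GitHub. @fangxu622
@Jiangfeng-Xiong I built it from the latest source, but now the cross_entropy2d function raises the error below. For now I just want to get your code running and then use it with my own data.
File "/home/sensetime/test/satellite_seg3/utils/loss.py", line 20, in cross_entropy2d
log_p = log_p[target.view(n, h, w, 1).repeat(1, 1, 1, c) >= 0]
IndexError: too many indices for tensor of dimension 2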
epoch 0 with learning rate: 0.001000
/home/sensetime/test/anaconda2/envs/py3/lib/python3.6/site-packages/torch/nn/functional.py:1762: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
/home/sensetime/test/satellite_seg3/utils/loss.py:18: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
log_p = F.log_softmax(input)
Traceback (most recent call last):
File "train.py", line 193, in <module>
train(args)
File "train.py", line 149, in train
loss = cross_entropy2d(outputs, labels,weights_per_class)
File "/home/sensetime/test/satellite_seg3/utils/loss.py", line 20, in cross_entropy2d
log_p = log_p[target.view(n, h, w, 1).repeat(1, 1, 1, c) >= 0]
IndexError: too many indices for tensor of dimension 2
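I keep feeling that the dimensions in this part of the code are wrong.
For now I have modified the cross_entropy2d function, but I'm not sure whether I changed its original meaning.
I'm still fairly sure that cross_entropy2d has a problem: the dimensions don't match. Also, were the X and Y dimensions in the visdom line-plotting code set up incorrectly?
In train.py,
I changed the vis.line call as follows. Please check whether it is correct; without this change, visdom complains that Y must be one-dimensional and that X and Y must have the same dimensions.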
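One possible reading of the IndexError above: by line 20, log_p is either still 4-D or already flattened to 2-D, while the mask built from target.view(n, h, w, 1).repeat(1, 1, 1, c) is 4-D, and newer PyTorch no longer falls back to element-wise matching when the shapes differ. The sketch below is not the repository's code. It assumes a recent PyTorch (1.x or later, where nll_loss accepts reduction=), assumes negative labels mark pixels to ignore, and keeps the convention of averaging over the number of valid pixels.

```python
import torch
import torch.nn.functional as F


def cross_entropy2d(input, target, weight=None, size_average=True):
    """Per-pixel cross-entropy; pixels with a negative label are ignored."""
    n, c, h, w = input.size()
    # Name the class dimension explicitly instead of relying on the implicit one.
    log_p = F.log_softmax(input, dim=1)
    # (N, C, H, W) -> (N, H, W, C) -> (N*H*W, C): one row per pixel.
    log_p = log_p.permute(0, 2, 3, 1).contiguous().view(-1, c)
    # Flatten the labels the same way, so the mask is 1-D and selects whole rows.
    target = target.reshape(-1)
    mask = target >= 0
    log_p = log_p[mask]
    target = target[mask]
    loss = F.nll_loss(log_p, target, weight=weight, reduction="sum")
    if size_average:
        # Assumption: divide by the number of valid pixels (at least 1).
        loss = loss / mask.sum().clamp(min=1).to(loss.dtype)
    return loss


if __name__ == "__main__":
    outputs = torch.randn(2, 5, 4, 4)          # (N, C, H, W) scores
    labels = torch.randint(0, 5, (2, 4, 4))    # (N, H, W) class indices
    labels[0, 0, 0] = -1                       # one ignored pixel
    print(cross_entropy2d(outputs, labels))
```

Because the mask is applied to rows of the (N*H*W, C) matrix, there is no need to repeat the labels along the class axis, which is where the shapes stopped agreeing.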
vis.line(
X=torch.ones((1)).cpu()*iter,
Y=torch.Tensor([loss.data[0]]).cpu(),
win=loss_window,
update='append')
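The call above does appear to satisfy the shape requirement described: X and Y are both 1-D tensors holding a single element. A minimal sketch of the same idea as a helper follows. The name append_loss_point is hypothetical, the vis object and window handle are passed in by the caller, and it assumes PyTorch 0.4 or later, where torch.tensor is available.

```python
import torch


def append_loss_point(vis, win, iteration, loss_value):
    """Append one (iteration, loss) point to an existing visdom line window."""
    # X and Y are both 1-D tensors with exactly one element, so their shapes match.
    x = torch.tensor([float(iteration)])
    y = torch.tensor([float(loss_value)])
    vis.line(X=x, Y=y, win=win, update="append")
    return x, y
```

In the training loop this would be called as append_loss_point(vis, loss_window, iter, loss.item()) on PyTorch 0.4 or later; on 0.3.x the scalar is read with loss.data[0], as in the call above.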
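Hello, did you get the code working so that it trains normally? I have recently run into a similar problem.
Hello, I have run into a similar problem too. How did you solve it?
I'm using Python 3 on Windows with PyTorch 0.3.1 and a 1080 Ti GPU with 11 GB of memory.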
python train.py --arch pspnet-densenet-s1s2-crf2 --img_rows 256 --img_cols 256 --n_epoch 50 --l_rate 1e-3 --batch_size 32 --gpu 0 --step 50 --traindir "dataset/stage1&stage2-train-crf2"
With batch_size = 32, it reports that it is out of memory.
With batch_size changed to 4 or 8, it reports the following error:
RuntimeError: invalid argument 1: the number of sizes provided must be greater or equal to the number of dimensions in the tensor at c:\anaconda2\conda-bld\pytorch_1