Open 9711128 opened 3 years ago
Is there a text description? I can't view this image.
TypeError: expected sequence object with len >= 0 or a single integer
TypeError: expected sequence object with len >= 0 or a single integer. Same problem here.
I'll bet fifty cents it's a version issue.
you win
no I lose
you win
I'm hitting the same problem. How did you solve it? I upgraded torch to 1.2.0 and it still fails.
See https://github.com/open-mmlab/mmdetection/issues/2842. In frcnn.py, add the check below inside detect_image:
def detect_image(self, image):
    with torch.no_grad():
        if isinstance(self.model, torch.nn.DataParallel):
            self.model.device_ids = [0]
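A minimal sketch of that workaround, generalized slightly: instead of hard-coding `[0]`, it keeps only the first entry of whatever `device_ids` the wrapper already has. With a single device, `nn.DataParallel.forward` calls the wrapped module directly and never gathers outputs from several replicas. The helper name `force_single_device` and the toy `nn.Linear` model are made up for illustration and are not from the repository.

```python
import torch
from torch import nn


def force_single_device(model: nn.Module) -> nn.Module:
    """Restrict a DataParallel wrapper to its first device (hypothetical helper)."""
    if isinstance(model, nn.DataParallel) and len(model.device_ids) > 1:
        # Keep only the first GPU so forward() takes the single-device path
        # and skips the gather step that fails on ndarray outputs.
        model.device_ids = model.device_ids[:1]
    return model


# Toy stand-in for self.model. On a CPU-only machine device_ids is already empty,
# so the helper leaves the wrapper unchanged.
model = force_single_device(nn.DataParallel(nn.Linear(4, 2)))
num_devices = len(model.device_ids)
```

In the thread's code this would be called once on `self.model` before running detection.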
Uh, what is this?
In rpn.forward, the rois are moved to the CPU before they are returned, which turns them from tensors into ndarrays, so DataParallel can no longer handle them. See https://discuss.pytorch.org/t/nn-dataparallel-typeerror-expected-sequence-object-with-len-0-or-a-single-integer/97082/23
Yes. Sorry, in this line I put tensor to cpu before gather.
return torch.unsqueeze(loss, 0), predicted_interaction.cpu().detach().view(-1, 1), correct_interaction.cpu().detach().view(-1, 1)
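To see where the message comes from: as far as I can tell from the PyTorch source, when a replica output is not a tensor, `gather` falls back to rebuilding the container with `type(out)(map(...))`. For an ndarray that amounts to calling `np.ndarray(<map object>)`, which NumPy rejects because a map object is not a valid shape. The sketch below mimics only that fallback using plain NumPy; `naive_gather` is a simplified stand-in, not the real PyTorch function, and the exact wording of the error varies between NumPy versions.

```python
import numpy as np


def naive_gather(outputs):
    """Simplified stand-in for the non-tensor fallback in DataParallel's gather."""
    first = outputs[0]
    # For ndarrays, type(first) is np.ndarray, so this becomes
    # np.ndarray(<map object>), and the map object is read as a shape.
    return type(first)(map(naive_gather, zip(*outputs)))


# Dummy per-replica outputs, e.g. rois returned as ndarrays from two GPUs.
replica_outputs = [np.zeros((2, 4)), np.zeros((3, 4))]

try:
    naive_gather(replica_outputs)
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```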
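What do you mean? I don't quite follow. If some part of the code is wrong, I need to fix it, but it runs without errors on my side, so I don't know what the problem is.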
https://github.com/bubbliiiing/faster-rcnn-pytorch/blob/ef53d380c71c3cc30b35ca1474b601c1d1f33574/frcnn.py#L118 https://github.com/bubbliiiing/faster-rcnn-pytorch/blob/ef53d380c71c3cc30b35ca1474b601c1d1f33574/nets/frcnn.py#L66
def forward(self, x, scale=1.):
    img_size = x.shape[2:]
    h = self.extractor(x)
    rpn_locs, rpn_scores, rois, roi_indices, anchor = \
        self.rpn.forward(h, img_size, scale)
    # print(np.shape(h))
    # print(np.shape(rois))
    # print(roi_indices)
    roi_cls_locs, roi_scores = self.head.forward(h, rois, roi_indices)
    return roi_cls_locs, roi_scores, rois, roi_indices
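Of the four values returned at the end, the last two (rois and roi_indices) are ndarrays, not tensors. According to the forum thread, DataParallel has to merge (gather) the results after the multi-GPU computation, and ndarrays cannot be gathered. This happens because the rpn moves the rois to the CPU: https://github.com/bubbliiiing/faster-rcnn-pytorch/blob/ef53d380c71c3cc30b35ca1474b601c1d1f33574/nets/rpn.py#L118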
for i in range(n):
    roi = self.proposal_layer(
        rpn_locs[i].cpu().data.numpy(),
        rpn_fg_scores[i].cpu().data.numpy(),
        anchor, img_size,
        scale=scale)
    batch_index = i * np.ones((len(roi),), dtype=np.int32)
    rois.append(roi)
    roi_indices.append(batch_index)
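One possible fix, sketched under assumptions: convert the proposals back to tensors before they leave the module, so that everything DataParallel has to gather is a `torch.Tensor`. The helper `to_tensor_on` and the dummy arrays are hypothetical and only mirror the shapes used in the loop above. I have not tested this against the repository. Note also that on a real multi-GPU run the batch indices would be local to each replica's slice of the batch.

```python
import numpy as np
import torch


def to_tensor_on(array, device):
    """Return `array` as a torch.Tensor on `device` (hypothetical helper)."""
    if isinstance(array, np.ndarray):
        return torch.from_numpy(array).to(device)
    return array.to(device)


# Dummy stand-ins for the per-image outputs collected in the loop above.
rois = [np.zeros((3, 4), dtype=np.float32), np.ones((2, 4), dtype=np.float32)]
roi_indices = [i * np.ones((len(roi),), dtype=np.int32) for i, roi in enumerate(rois)]

# In the real module this would be the device of the feature map, e.g. h.device.
device = torch.device("cpu")

# Concatenate per-image results, then hand tensors (not ndarrays) to the caller.
rois_t = to_tensor_on(np.concatenate(rois, axis=0), device)
roi_indices_t = to_tensor_on(np.concatenate(roi_indices, axis=0), device)
```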
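I suspect it is a version issue. Your code sets the environment variable so that only one GPU is visible at inference time, so your version may be fine, while on other people's versions DataParallel still gathers using the multi-GPU mechanism and fails. For inference, you can simply remove DataParallel.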
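A minimal sketch of dropping the wrapper for inference: `nn.DataParallel` keeps the original network in its `.module` attribute, so it can be unwrapped instead of editing the training code. The helper name `unwrap_data_parallel` and the toy `nn.Linear` network are assumptions for illustration, not code from the repository.

```python
import torch
from torch import nn


def unwrap_data_parallel(model: nn.Module) -> nn.Module:
    """Return the wrapped network if `model` is an nn.DataParallel instance."""
    if isinstance(model, nn.DataParallel):
        return model.module
    return model


net = nn.Linear(4, 2)            # toy stand-in for the detector
wrapped = nn.DataParallel(net)   # what the training code would produce

inference_model = unwrap_data_parallel(wrapped).eval()

# Build the input on whatever device the parameters live on.
device = next(inference_model.parameters()).device
with torch.no_grad():
    out = inference_model(torch.zeros(1, 4, device=device))
```

Because the unwrapped model runs on a single device, its outputs are never gathered, so ndarray return values no longer trigger the error.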