zhenghao977 / FCOS-PyTorch-37.2AP

A pure PyTorch implementation of FCOS (37.2 AP)

A few things I don't understand #13

Closed Quanyin-li closed 3 years ago

Quanyin-li commented 3 years ago

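There are a few things I don't understand, and I would appreciate your help.

1. The README says 78.7 mAP, but the Baidu Cloud link actually provides the 77.8 mAP weights.
2. I ran detect.py with voc_77.8.pth, but the weights don't match the model and it won't run. Can voc_77.8.pth only be used for evaluation and not for inference?
3. I downloaded VectXmy's VOC inference weights. They run, but the images show no predictions at all.

I'm a beginner and there is a lot I don't understand, so I hope you can help.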

Kuuuo commented 3 years ago

Same problem here: after training finishes, detect.py won't run, so I can't test on images. Has the original poster solved this?

zhenghao977 commented 3 years ago

@Kuuuo @Quanyin-li Could you share the error messages?

Kuuuo commented 3 years ago

Thank you, I have already solved the problem. Here are the two errors I ran into:

First error:

    share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
    Traceback (most recent call last):
      File "detect.py", line 87, in <module>
        model.load_state_dict(torch.load("./checkpoint/model_1.pth",map_location=torch.device('cuda')))
      File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
        self.__class__.__name__, "\n\t".join(error_msgs)))
    RuntimeError: Error(s) in loading state_dict for DataParallel:
        size mismatch for module.body.head.cls_out.weight: copying a param with shape torch.Size([720, 256, 3, 3]) from checkpoint, the shape in current model is torch.Size([45, 256, 3, 3]).
        size mismatch for module.body.head.cls_out.bias: copying a param with shape torch.Size([720]) from checkpoint, the shape in current model is torch.Size([45]).

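The cause was that the number of classes set in detect.py did not match the number of classes in the config. Setting them to the same value fixed it.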
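A size mismatch like the one above can be spotted before calling load_state_dict by comparing parameter shapes in the checkpoint with those in the freshly built model. The following is only a minimal sketch: find_shape_mismatches is a hypothetical helper, not part of this repository, and the shapes are copied from the error message rather than read from a real checkpoint.

```python
def find_shape_mismatches(checkpoint_state, model_state):
    """Return {name: (checkpoint_shape, model_shape)} for entries whose shapes differ.

    Values may be tensors (anything with a .shape attribute) or plain shape tuples.
    Keys present in only one of the two mappings are ignored here.
    """
    mismatches = {}
    for name, ckpt_value in checkpoint_state.items():
        if name not in model_state:
            continue  # missing or unexpected keys are a separate problem
        ckpt_shape = tuple(getattr(ckpt_value, "shape", ckpt_value))
        model_shape = tuple(getattr(model_state[name], "shape", model_state[name]))
        if ckpt_shape != model_shape:
            mismatches[name] = (ckpt_shape, model_shape)
    return mismatches


# Shapes copied from the error message in this thread (illustrative data only).
checkpoint_shapes = {
    "module.body.head.cls_out.weight": (720, 256, 3, 3),
    "module.body.head.cls_out.bias": (720,),
}
model_shapes = {
    "module.body.head.cls_out.weight": (45, 256, 3, 3),
    "module.body.head.cls_out.bias": (45,),
}

for name, (ckpt, current) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs current model {current}")
```

With real objects, the same helper should accept the result of torch.load(path, map_location="cpu") and model.state_dict() directly, since tensor shapes convert to tuples. A mismatch confined to cls_out, as here, points to a different class count between training and detection.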

Second error:

    share@-System-Product-Name:~/RetinaNet-Pytorch-36.4AP-master$ python detect.py
    info====>success freeze bn
    info=====> success freeze stage 1
    ===>success loading model
    Traceback (most recent call last):
      File "detect.py", line 108, in <module>
        out=model(img1.unsqueeze_(dim=0))
      File "/home/share/.local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/share/.local/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 146, in forward
        "them on device: {}".format(self.src_device_obj, t.device))
    RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

The second error occurred because no GPU was specified, so the model was still on the CPU. Adding the following GPU/CUDA setup code fixed it:

    USE_CUDA = torch.cuda.is_available()
    device = torch.device("cuda:0" if USE_CUDA else "cpu")
    model = torch.nn.DataParallel(model, device_ids=[0, 1])
    # model = torch.nn.DataParallel(model)
    model.to(device)

zhenghao977 commented 3 years ago

@Quanyin-li Understood. I will fix the cls_num in the config.