ruoqianguo / FPN_Pytorch

Base jwyang/fpn.pytorch, train FPN on Pascal VOC get 80.5 mAP
MIT License
103 stars 19 forks

RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at /opt/conda/conda #2

Open KevinQian97 opened 6 years ago

KevinQian97 commented 6 years ago

Hello, I used your code to train, but the model terminates after the first iteration. Would you please help me find the problem? Thank you. Here is my traceback:

[session 1][epoch 1][iter 0] loss: 4.0006, lr: 1.00e-02
fg/bg=(128/384), time cost: 7.218862
rpn_cls: 0.6919, rpn_box: 0.1386, rcnn_cls: 2.8319, rcnn_box 0.3382
Traceback (most recent call last):
  File "trainval_net.py", line 330, in <module>
    roi_labels = FPN(im_data, im_info, gt_boxes, num_boxes)
  File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 357, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 73, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 83, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/zhiqi.cheng/anaconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 67, in parallel_apply
    raise output
RuntimeError: invalid argument 2: Input tensor must have same size as output tensor apart from the specified dimension at /opt/conda/conda-bld/pytorch_1518238409320/work/torch/lib/THC/generic/THCTensorScatterGather.cu:29

KevinQian97 commented 6 years ago

I found that the code runs normally with faster-rcnn, but it fails when I use the FPN code. So I guess the problem is in fpn.py, but I still can't find out why. What's more, I was training this model on my own data; if I switch the data back to the original VOC2007, it works. That's strange, because I only converted my own data into the VOC2007 format. Here is one of my annotation files:

train
VIRAT_S_000000.mp4_0
C:/Users/Kevin Qian/Downloads/images/train/VIRAT_S_000000.mp4_0.jpg
Unknown
1920 1080 3
0
Other 0 636 723 655 787
Other 0 411 618 438 703
Person 0 349 709 410 850
Other 0 760 758 778 831
Person 0 1386 245 1432 354
Person 0 276 688 345 845
Other 0 512 687 541 747

And here is an annotation file from the original VOC2007:

VOC2007
009962.jpg
The VOC2007 Database PASCAL VOC2007 flickr 246788553
Tool - Wroclaw Milosz J.
500 375 3
0
chair Right 1 0 211 192 324 326
person Unspecified 1 0 162 72 273 248
person Right 1 0 250 68 473 312
person Right 1 0 4 2 253 374
diningtable Unspecified 1 1 358 216 500 375

KevinQian97 commented 6 years ago

I solved the problem by downloading the whole Pascal VOC data set and changing its data part, instead of using my personal data directly. Interestingly, I think your code is based on jwyang's, yet with this approach I can train successfully with your code while it still doesn't work with jwyang's. So, would you mind telling me whether you changed any code related to reading data from the data set?
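
For anyone hitting the same error with custom data: since the failure went away once the data followed the Pascal VOC layout exactly, one way to narrow things down is to check each annotation file before training. Below is a minimal sketch, not code from this repository. It assumes the annotations are standard VOC XML (`size`, `object`, `name`, `pose`, `truncated`, `difficult`, `bndbox`) with 1-based pixel coordinates, as in the VOC devkit. The function name `check_voc_annotation`, the `known_classes` argument, and the case-insensitive class comparison are all made up for illustration. Passing these checks does not prove the data is fine, and failing them does not confirm the cause of the RuntimeError above.

```python
import xml.etree.ElementTree as ET

# Child tags that a Pascal VOC 2007 <object> entry normally carries.
EXPECTED_OBJECT_TAGS = ("name", "pose", "truncated", "difficult", "bndbox")


def _int(node, tag):
    """Read an integer child value, or None if the node or tag is missing."""
    text = node.findtext(tag) if node is not None else None
    if text is None or not text.strip():
        return None
    return int(float(text))


def check_voc_annotation(xml_text, known_classes):
    """Return a list of problems found in one VOC-style annotation.

    known_classes is a set of lowercase class names (hypothetical helper
    argument; the comparison here is case-insensitive by choice).
    """
    problems = []
    root = ET.fromstring(xml_text)

    size = root.find("size")
    width, height = _int(size, "width"), _int(size, "height")
    if width is None or height is None:
        return ["missing or incomplete <size>"]

    objects = root.findall("object")
    if not objects:
        problems.append("no <object> entries")

    for i, obj in enumerate(objects):
        name = (obj.findtext("name") or "").strip()
        if name.lower() not in known_classes:
            problems.append("object %d: unknown class %r" % (i, name))

        for tag in EXPECTED_OBJECT_TAGS:
            if obj.find(tag) is None:
                problems.append("object %d: missing <%s>" % (i, tag))

        box = obj.find("bndbox")
        coords = [_int(box, t) for t in ("xmin", "ymin", "xmax", "ymax")]
        if None in coords:
            problems.append("object %d: incomplete bndbox" % i)
            continue
        xmin, ymin, xmax, ymax = coords
        # Assumes 1-based VOC coordinates, so 0 or negative values are flagged.
        if not (1 <= xmin < xmax <= width and 1 <= ymin < ymax <= height):
            problems.append("object %d: box %s out of range or degenerate"
                            % (i, coords))
    return problems
```

Running it over every XML file in the annotations folder with something like `check_voc_annotation(open(path).read(), {"person", "other"})` and printing any non-empty result shows quickly whether a custom set differs structurally from the original VOC files.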

jacco09 commented 6 years ago

I ran into the same problem. Could you share your solution in detail? Thanks!

JingXiaolun commented 6 years ago

@KevinQian97, I ran into the same problem. Could you share your solution in detail? Thanks!

hailey94 commented 5 years ago

@KevinQian97, I ran into the same problem. Could you share your solution in detail? Thanks!