Sunarker / Collaborative-Learning-for-Weakly-Supervised-Object-Detection

MIT License

Initialization for premodel of different nets #2

Open ShawnLiu1011 opened 5 years ago

ShawnLiu1011 commented 5 years ago

Traceback (most recent call last):
  File "./tools/trainval_net.py", line 150, in <module>
    max_iters=args.max_iters)
  File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 365, in train_net
    sw.train_model(max_iters)
  File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 239, in train_model
    lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = self.initialize()
  File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/model/train_val.py", line 179, in initialize
    self.net.load_state_dict(model_dict)
  File "/home/nieqinqin/liuxiaoyu/CLWS/tools/../lib/nets/network.py", line 609, in load_state_dict
    nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
  File "/home/nieqinqin/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 487, in load_state_dict
    .format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]).
Command exited with non-zero status 1
7.33user 5.80system 0:13.57elapsed 96%CPU (0avgtext+0avgdata 2171288maxresident)k
0inputs+8outputs (0major+709527minor)pagefaults 0swaps

ww1024cc commented 5 years ago

We checked the WSDDN pretrained model again and confirmed that nothing is wrong with it. cls_score_net.weight is indeed a [20, 4096] tensor. You should check https://github.com/Sunarker/Collaborative-Learning-for-Weakly-Supervised-Object-Detection/blob/4bd0df739b8bc3c50cc8cdfed905caaae0b12b06/lib/nets/vgg16.py#L26 and https://github.com/Sunarker/Collaborative-Learning-for-Weakly-Supervised-Object-Detection/blob/4bd0df739b8bc3c50cc8cdfed905caaae0b12b06/lib/nets/network.py#L398 in the original code.

ShawnLiu1011 commented 5 years ago

We checked the WSDDN pretrained model again and confirmed that nothing is wrong with it. cls_score_net.weight is indeed a [20, 4096] tensor. You should check

Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/vgg16.py, line 26 in 4bd0df7:

self._fc7_channels = 4096

and Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/network.py, line 398 in 4bd0df7:

self.cls_score_net = nn.Linear(self._fc7_channels, self._num_classes) # between class

in the original code.

Thanks for your help, and sorry for the mistaken issue. Actually, I found that the problem is that I used ResNet, whose cls_score_net.weight is a [20, 2048] tensor, as set here: https://github.com/Sunarker/Collaborative-Learning-for-Weakly-Supervised-Object-Detection/blob/4bd0df739b8bc3c50cc8cdfed905caaae0b12b06/lib/nets/resnet_v1.py#L215 I'm trying to download the VGG pretrained model, but is there any way to use ResNet?
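
For readers following along, the two shapes in the error message follow from how a linear layer stores its weights: torch.nn.Linear(in_features, out_features) keeps a weight of shape [out_features, in_features]. Here is a minimal sketch in plain Python. The helper name linear_weight_shape is made up for illustration and is not part of the repository, and the class count of 20 is simply read off the shapes reported in the error.

```python
def linear_weight_shape(in_features, out_features):
    """Shape of the weight that torch.nn.Linear(in_features, out_features) stores."""
    return (out_features, in_features)


NUM_CLASSES = 20  # taken from the shapes in the error message

# vgg16.py sets _fc7_channels = 4096; resnet_v1.py sets it to 2048.
vgg_shape = linear_weight_shape(4096, NUM_CLASSES)
resnet_shape = linear_weight_shape(2048, NUM_CLASSES)

print("VGG16 cls_score_net.weight: ", vgg_shape)
print("ResNet cls_score_net.weight:", resnet_shape)
print("Shapes match:", vgg_shape == resnet_shape)
```

So a checkpoint trained with the VGG16 backbone cannot be copied into a ResNet-based model as is, which is what the RuntimeError reports.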

Jngwl commented 5 years ago

Thanks for your help, and sorry for the mistaken issue. Actually, I found that the problem is that I used ResNet, whose cls_score_net.weight is a [20, 2048] tensor, as set in Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py, line 215 in 4bd0df7:

self._fc7_channels = 2048

I'm trying to download the VGG pretrained model, but is there any way to use ResNet?

Hello, ShawnLiu. I encountered the same error as you when I ran ./experiments/scripts/train.sh 0 pascal_voc res101 voc07_wsddn_pre:

While copying the parameter named cls_score_net.weight, whose dimensions in the model are torch.Size([20, 2048]) and whose dimensions in the checkpoint are torch.Size([20, 4096]), ...

After that, in Collaborative-Learning-for-Weakly-Supervised-Object-Detection/lib/nets/resnet_v1.py, I set

self._fc7_channels = 4096

but then I got RuntimeError: cuda runtime error (2) : out of memory. Have you found any way to use ResNet? Thank you.
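
One generic way to get past a shape mismatch like this, offered only as a sketch and not as the authors' method, is to copy just those checkpoint entries whose names and shapes match the model and leave the rest at their initial values. The function split_by_shape and the FakeTensor stand-in below are hypothetical names used so the example runs without PyTorch, and the bias shape is an assumption made for illustration.

```python
from collections import namedtuple


def split_by_shape(model_state, checkpoint_state):
    """Split checkpoint entries into those that fit the model and those that do not."""
    compatible, skipped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(target.shape) == tuple(value.shape):
            compatible[name] = value
        else:
            skipped.append(name)
    return compatible, skipped


# Stand-in for a tensor: only the .shape attribute is needed here.
FakeTensor = namedtuple("FakeTensor", ["shape"])

# Model built with the ResNet backbone (fc7 channels = 2048).
model_state = {
    "cls_score_net.weight": FakeTensor((20, 2048)),
    "cls_score_net.bias": FakeTensor((20,)),  # assumed shape, for illustration
}
# Checkpoint trained with the VGG16 backbone (fc7 channels = 4096).
checkpoint_state = {
    "cls_score_net.weight": FakeTensor((20, 4096)),
    "cls_score_net.bias": FakeTensor((20,)),  # assumed shape, for illustration
}

compatible, skipped = split_by_shape(model_state, checkpoint_state)
print("copied: ", sorted(compatible))
print("skipped:", skipped)
```

With real PyTorch objects, model_state would be net.state_dict() and checkpoint_state the dictionary returned by torch.load; after model_state.update(compatible), the merged dictionary can be passed to load_state_dict. Skipped layers stay randomly initialized, and a checkpoint trained on VGG16 probably shares few parameter shapes with a ResNet backbone, so this mainly helps to see what is incompatible rather than to give ResNet a useful initialization.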

ShawnLiu1011 commented 5 years ago

@Jngwl I guess it's because of your GPU memory. ResNet may need more than 16 GB. Sorry, I didn't try ResNet after that.

alexshaodong commented 5 years ago

Hello! I ran into the same situation. Which file should I edit to adjust the batch size? I can't find it. Thank you!

Jngwl commented 5 years ago

You can adjust the batch size by modifying the file ./experiments/cfgs/res101.yml

alexshaodong commented 5 years ago

Hello! Thank you for your answer! My GPU has 12 GB of memory. I changed batch_size in ./experiments/cfgs/res101.yml to 2, but it still runs out of memory. Which values did you change to finally get it to run?

weilaizhe666 commented 4 years ago

So, how did you solve this problem? I ran into it too, but in my case it may not be caused by GPU memory, and I can't solve it on my own.

weilaizhe666 commented 4 years ago

@Jngwl So, how did you solve this problem? I ran into it too, but in my case it may not be caused by GPU memory, and I can't solve it on my own.