junfu1115 / DANet

Dual Attention Network for Scene Segmentation (CVPR2019)
MIT License
2.39k stars, 483 forks

How do I fix a network parameter error during training? #133

Open Eileen000 opened 3 years ago

Eileen000 commented 3 years ago

The error is as follows:

Traceback (most recent call last):
  File "/drive/MyDrive/DANet-0.5.0/danet/train.py", line 199, in <module>
    trainer = Trainer(args)
  File "/drive/MyDrive/DANet-0.5.0/danet/train.py", line 67, in __init__
    multi_dilation=args.multi_dilation)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/__init__.py", line 17, in get_segmentation_model
    return models[name.lower()](**kwargs)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/danet.py", line 119, in get_danet
    model = DANet(datasets[dataset.lower()].NUM_CLASS, backbone=backbone, root=root, **kwargs)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/danet.py", line 40, in __init__
    super(DANet, self).__init__(nclass, backbone, aux, se_loss, norm_layer=norm_layer, **kwargs)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/base.py", line 46, in __init__
    multi_grid=multi_grid,multi_dilation=multi_dilation)
  File "/drive/MyDrive/DANet-0.5.0/encoding/dilated/resnet.py", line 283, in resnet101
    get_model_file('resnet101', root=root)), strict=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 830, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ResNet:
    size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 1, 1]).
    size mismatch for layer1.0.downsample.0.weight: copying a param with shape torch.Size([256, 128, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 64, 1, 1]).

How can I fix this?
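The message above reports that two checkpoint tensors have 128 input channels where the current model expects 64. Before calling `load_state_dict`, it can help to list every parameter whose shape differs. The following is a minimal sketch, not code from this repository: the helper name `find_shape_mismatches` is hypothetical, and shapes are passed as plain tuples so the sketch runs without PyTorch. With PyTorch, such a dictionary could be built as `{name: tuple(t.shape) for name, t in state_dict.items()}`.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for every parameter
    that exists in both dictionaries but with a different shape."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message in this thread.
checkpoint_shapes = {
    "layer1.0.conv1.weight": (64, 128, 1, 1),
    "layer1.0.downsample.0.weight": (256, 128, 1, 1),
}
model_shapes = {
    "layer1.0.conv1.weight": (64, 64, 1, 1),
    "layer1.0.downsample.0.weight": (256, 64, 1, 1),
}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Both mismatches are in the second dimension (input channels) of the first block of `layer1`, which suggests, as an assumption rather than a confirmed diagnosis, that the checkpoint was saved from a ResNet variant whose stem outputs 128 channels instead of 64.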

Eileen000 commented 3 years ago

And what about this error?

Traceback (most recent call last):
  File "/drive/MyDrive/DANet-0.5.0/danet/train.py", line 204, in <module>
    trainer.training(epoch)
  File "/drive/MyDrive/DANet-0.5.0/danet/train.py", line 128, in training
    outputs = self.model(image)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/danet.py", line 45, in forward
    _, _, c3, c4 = self.base_forward(x)
  File "/drive/MyDrive/DANet-0.5.0/encoding/models/base.py", line 61, in base_forward
    c1 = self.pretrained.layer1(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/drive/MyDrive/DANet-0.5.0/encoding/dilated/resnet.py", line 96, in forward
    out = self.conv1(x)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 128, 1, 1], expected input[1, 64, 192, 192] to have 128 channels, but got 64 channels instead
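This second message can be read directly from the shapes it prints. A 2D convolution weight is laid out as (out_channels, in_channels / groups, kernel_h, kernel_w) and its input as (batch, channels, height, width), so the input's channel count has to equal the weight's second dimension times `groups`. A minimal sketch of that check in plain Python, with the hypothetical helper name `check_conv_input`:

```python
def check_conv_input(weight_shape, input_shape, groups=1):
    """Compare the channel count a conv weight expects with what the input has.

    weight_shape: (out_channels, in_channels // groups, kernel_h, kernel_w)
    input_shape:  (batch, channels, height, width)
    Returns (matches, expected_channels, actual_channels).
    """
    expected = weight_shape[1] * groups
    actual = input_shape[1]
    return expected == actual, expected, actual


# Shapes copied from the RuntimeError in this thread.
ok, expected, actual = check_conv_input((64, 128, 1, 1), (1, 64, 192, 192))
print(ok, expected, actual)  # False 128 64
```

Here the first convolution of `layer1` holds a weight that expects 128 input channels, while the layers before it produce 64, which matches the channel difference reported in the earlier `load_state_dict` error.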

JIA-HONG-CHU commented 3 years ago

Change the input in danet.py; you can use transpose and view for this.
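For readers unfamiliar with these two operations: in PyTorch, `Tensor.transpose` swaps two dimensions and `Tensor.view` reinterprets the same elements under a new shape, so neither changes how many elements a tensor holds. The sketch below only tracks shapes in plain Python. The helper names `transpose_shape` and `view_shape` are hypothetical, and the sketch ignores memory layout (in PyTorch, a `view` after a `transpose` may additionally need `.contiguous()`).

```python
from math import prod


def transpose_shape(shape, dim0, dim1):
    """Shape after swapping two dimensions, as Tensor.transpose would."""
    out = list(shape)
    out[dim0], out[dim1] = out[dim1], out[dim0]
    return tuple(out)


def view_shape(shape, new_shape):
    """Shape after a view. At most one entry may be -1 (inferred).
    Raises ValueError if the total element count would change."""
    total = prod(shape)
    new_shape = tuple(new_shape)
    if new_shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    if -1 in new_shape:
        known = prod(d for d in new_shape if d != -1)
        if known == 0 or total % known != 0:
            raise ValueError(f"cannot view {shape} as {new_shape}")
        new_shape = tuple(total // known if d == -1 else d for d in new_shape)
    if prod(new_shape) != total:
        raise ValueError(f"cannot view {shape} as {new_shape}")
    return new_shape


x = (1, 64, 192, 192)                   # input shape from the error in this thread
flat = view_shape(x, (1, 64, -1))       # flatten the spatial dimensions
swapped = transpose_shape(flat, 1, 2)   # move channels to the last dimension
print(flat, swapped)
```

Because the element count is preserved, asking for `(1, 128, 192, 192)` from this input raises an error in the sketch. Reshaping alone therefore cannot supply the 128 channels the convolution weight expects without also changing the spatial dimensions.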

wlj567 commented 3 years ago

Change the input in danet.py; you can use transpose and view for this.

Hello, could you be more specific? Thanks.

JIA-HONG-CHU commented 3 years ago

https://zhuanlan.zhihu.com/p/342675997 This article explains it in a fairly easy-to-follow way.

wlj567 commented 3 years ago

https://zhuanlan.zhihu.com/p/342675997 This article explains it in a fairly easy-to-follow way.

Sorry for the trouble, and thank you.

wlj567 commented 3 years ago

@Eileen000 Hello, are you able to run the complete code? Would you mind sharing your contact information? Thanks.

Eileen000 commented 3 years ago

Yes, it runs, but some things need to be changed. You also need to pin a specific PyTorch version, and your GPU cannot be too low-end. I have run it on both my local machine and Colab. I can provide the code if you need it, but for a fee.

JIA-HONG-CHU commented 3 years ago

I have used EncNet, DANet, and DRANet with the mmsegmentation toolkit, with a Swin Transformer backbone instead of ResNet-101. You can refer to https://github.com/JIA-HONG-CHU/Swin-Transformer-add-EncNet-DaNet-DraNet-for-semantic-segmentation-on-Statelite-Dataset

wlj567 commented 3 years ago

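Thank you both very much for your replies. As a newcomer, I still have a lot of problems with the code, so I will keep working through it myself. Sorry to have bothered you.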

wlj567 commented 2 years ago

Hello everyone, I would like to ask how to train DANet on the Cityscapes dataset. I would appreciate a reply when you have time. Thanks in advance.