Pretrained Model Architecture Mismatch and Training Errors in SparseMat

ztMotaLee commented 10 months ago

Hello, and thank you for your wonderful work on image matting! I ran into some problems when testing the pretrained model. The checkpoint you provided does not seem to match the parameter shapes of the network, for example:

... RuntimeError: Error(s) in loading state_dict for SparseMat: size mismatch for shm.backbone.conv1.weight: copying a param with shape torch.Size([3, 3, 4, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 4]).

At the same time, when I tried to retrain the model, I encountered the following problem:

... SparseMat-main/model/backbones/sparse_resnet_bn.py", line 61, in forward
    out = self.conv1(x)
...
AssertionError: channel size mismatch.

Could you please check the code carefully or provide a corrected version? Thank you very much for your attention to these matters and for your dedication to open-source development.

hyunghoon-kim commented 8 months ago

Same here. I get the same error:

RuntimeError: Error(s) in loading state_dict for SparseMat: size mismatch for shm.backbone.conv1.weight: copying a param with shape torch.Size([3, 3, 4, 64]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 4]).
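Both reports show the same pair of shapes, and the checkpoint shape [3, 3, 4, 64] becomes the model shape [64, 3, 3, 4] if the last axis is moved to the front. One possible explanation, which nothing in this thread confirms, is that the checkpoint was saved with a different weight layout than the installed sparse convolution library expects. The sketch below only works on shape tuples and needs no extra packages. It sorts parameters into those that already match, those that would match after that one axis move, and those that would not. The helper names `move_last_axis_first` and `classify_mismatches` are hypothetical and are not part of SparseMat.

```python
# Hypothetical diagnostic helpers; not part of SparseMat or any library.

def move_last_axis_first(shape):
    """Return the shape obtained by moving the last axis to the front."""
    shape = tuple(shape)
    return shape[-1:] + shape[:-1]


def classify_mismatches(ckpt_shapes, model_shapes):
    """Group parameter names by how the checkpoint shape relates to the model shape."""
    report = {"ok": [], "permutable": [], "incompatible": [], "missing": []}
    for name, model_shape in model_shapes.items():
        if name not in ckpt_shapes:
            report["missing"].append(name)
            continue
        ckpt_shape = tuple(ckpt_shapes[name])
        model_shape = tuple(model_shape)
        if ckpt_shape == model_shape:
            report["ok"].append(name)
        elif move_last_axis_first(ckpt_shape) == model_shape:
            # Assumption: only the "last axis to front" layout change is tested.
            report["permutable"].append(name)
        else:
            report["incompatible"].append(name)
    return report


# Shapes copied from the error message quoted in this thread.
ckpt_shapes = {"shm.backbone.conv1.weight": (3, 3, 4, 64)}
model_shapes = {"shm.backbone.conv1.weight": (64, 3, 3, 4)}

report = classify_mismatches(ckpt_shapes, model_shapes)
print(report["permutable"])  # ['shm.backbone.conv1.weight']
```

With PyTorch, the shape dictionaries could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`, and a tensor listed under "permutable" could be converted with `w.permute(3, 0, 1, 2).contiguous()` before calling `load_state_dict`. Whether the converted weights then give correct results is untested here.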

jiamingkong commented 3 months ago

I encountered the same issue. SparseResNet18 has a setting called in_planes, which is set to 128 at initialization. The network starts with three SparseConv2D layers that turn x from shape [N, 4] into [N, 64], but layer1 in SparseResNet18 then immediately expects a tensor of shape [N, 128]. Your help would be appreciated!
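The channel bookkeeping described above can be checked without running the network. The following is a minimal sketch that walks a list of (name, expected input channels, output channels) entries and reports the first place where they disagree. The function `first_channel_mismatch` is hypothetical, the three stem layers are collapsed into one entry because their intermediate widths are not given in this thread, and the output width of layer1 is a placeholder.

```python
# Hypothetical helper for tracing channel counts; not part of SparseMat.

def first_channel_mismatch(layers, in_channels):
    """Return (layer name, channels received, channels expected) for the
    first layer whose expected input width differs from what it receives,
    or None if every layer lines up."""
    current = in_channels
    for name, expected_in, out_channels in layers:
        if current != expected_in:
            return (name, current, expected_in)
        current = out_channels
    return None


layers = [
    # Stem as described in the comment: [N, 4] -> [N, 64].
    ("stem (3 SparseConv2D layers)", 4, 64),
    # layer1 expects in_planes = 128; its output width here is a placeholder.
    ("layer1", 128, 128),
]

mismatch = first_channel_mismatch(layers, in_channels=4)
print(mismatch)  # ('layer1', 64, 128)
```

In this sketch, changing the expected input of layer1 to 64 makes the function return None. Whether setting in_planes to 64 is the right fix for the repository, or whether the stem should instead produce 128 channels, is not established by this thread.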

yuqilol commented 2 months ago

I managed to fix it, but then a new error appeared.
