noahzn / Lite-Mono

[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
MIT License

CDilated weird ddconv weight shape #121

Closed maxvandenhoven closed 6 months ago

maxvandenhoven commented 6 months ago

Hi @noahzn,

When loading the pretrained weights for the depth encoder, I noticed that the dilated convolution layer has an unexpected shape:

stages.0.0.ddwconv.conv.weight has shape torch.Size([48, 1, 3, 3]), where I would expect it to have shape torch.Size([48, 48, 3, 3]), since the number of channels stays the same in DilatedConv. Do you have any insight on why this is the case?

Thanks!

noahzn commented 6 months ago

Hi, please make sure you are using the correct pre-trained weights and model. You can select the model with the --model flag.

maxvandenhoven commented 6 months ago

I am loading the .pth file directly with torch.load for inspection; that is how I noticed.

noahzn commented 6 months ago

Could you load the decoder using my code to see if it works or not?

maxvandenhoven commented 6 months ago

I did some further investigation, and it turns out the kernel's input-channel dimension of 1 comes from the use of groups in the dilated convolution: with groups equal to the number of channels, it is a depthwise convolution, so each filter sees only one input channel. The code works fine; I was just curious why the kernel weight was shaped like that. Thanks!
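For reference, a minimal PyTorch sketch (assuming a depthwise convolution with groups equal to the channel count, as in DilatedConv) reproduces the observed weight shape; the channel count of 48 matches stage 0 of the depth encoder:

```python
import torch
import torch.nn as nn

channels = 48

# Regular convolution: weight shape is [out_channels, in_channels, kH, kW]
regular = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
print(regular.weight.shape)  # torch.Size([48, 48, 3, 3])

# Depthwise convolution: with groups=channels, each filter sees only
# in_channels / groups = 1 input channel, so the weight shape is
# [out_channels, 1, kH, kW]
depthwise = nn.Conv2d(channels, channels, kernel_size=3,
                      padding=2, dilation=2, groups=channels)
print(depthwise.weight.shape)  # torch.Size([48, 1, 3, 3])
```

So a [48, 1, 3, 3] weight is exactly what PyTorch stores for a 48-channel depthwise (grouped) convolution; the dilation only affects the receptive field, not the weight shape.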