hustvl / Vim

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Apache License 2.0
2.55k stars 159 forks

How can I set the number of input channels to 4 when loading the pretrained model without getting an error? #83

Open BranStarkkk opened 1 month ago

BranStarkkk commented 1 month ago

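As the title says. Hello authors, your work is an admirable contribution. I have one question, though: when loading the pretrained Vim-Tiny model, how can I set the number of input channels to 4 without hitting the error saying it "does not match the expected number of input channels, 3"? Looking forward to your reply!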

BranStarkkk commented 1 month ago

Solved. The fix is to build the model with 3 channels so the pretrained weights load, then replace the first layer afterwards:

self.model = VisionMamba(pretrained=pretrained, patch_size=16, channels=3)  # pretrained=True
self.model.patch_embed.proj = nn.Conv2d(channels, 192, kernel_size=(patch_size, patch_size), stride=(8, 8))  # channels=4; patch_size is the patch size set for your own model
self.model.head = nn.Identity()
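One thing to keep in mind with this fix: a freshly constructed nn.Conv2d is randomly initialized, so the pretrained patch-embedding filters are thrown away. A possible alternative, sketched below, is to copy the pretrained 3-channel filters into the new 4-channel layer and fill the extra channel with their mean. This is not from the Vim repo. The helper name expand_input_channels is hypothetical, the stand-in conv uses the 192 / 16 / 8 values from the snippet above, and the sketch assumes a plain convolution (groups=1, default dilation).

```python
import torch
import torch.nn as nn


def expand_input_channels(conv: nn.Conv2d, in_chans: int) -> nn.Conv2d:
    """Hypothetical helper: return a conv like `conv` that takes `in_chans` inputs.

    Pretrained filters are reused for the original channels; any extra
    channel is initialized with the mean of the pretrained filters.
    Assumes groups=1 and default dilation.
    """
    new_conv = nn.Conv2d(
        in_chans,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        old_w = conv.weight                      # (out, old_in, kh, kw)
        old_in = old_w.shape[1]
        n_copy = min(old_in, in_chans)
        # Reuse the pretrained filters for the channels that already existed.
        new_conv.weight[:, :n_copy] = old_w[:, :n_copy]
        if in_chans > old_in:
            # Extra channels start from the per-filter mean over input channels.
            new_conv.weight[:, old_in:] = old_w.mean(dim=1, keepdim=True)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for the pretrained model.patch_embed.proj (assumed 3 -> 192 channels).
pretrained_proj = nn.Conv2d(3, 192, kernel_size=16, stride=8)
proj4 = expand_input_channels(pretrained_proj, in_chans=4)

out = proj4(torch.randn(1, 4, 224, 224))
print(out.shape)
```

In the setup above this would be used as self.model.patch_embed.proj = expand_input_channels(self.model.patch_embed.proj, 4), called after the pretrained weights are loaded. Whether it trains better than a random first layer depends on what the fourth channel contains.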

NakrAi commented 2 weeks ago

import causal_conv1d_cuda: hey, did you ever manage to solve this one?