MzeroMiko / VMamba

VMamba: Visual State Space Models; code is based on Mamba
MIT License
1.82k stars, 98 forks

About the pretrained classification models. #238

Closed. aifeixingdelv closed this issue 1 week ago

aifeixingdelv commented 1 week ago

Could the pretrained classification models linked in the README contain errors? When I instantiate the Backbone_VSSM class to create a vmamba-tiny model and load the pretrained weights from the README, I get this log:

Successfully load ckpt models/vssm1_tiny_0230s_ckpt_epoch_264.pth
Failed loading checkpoint form models/vssm1_tiny_0230s_ckpt_epoch_264.pth: Error(s) in loading state_dict for MambaEncoder:
    size mismatch for layers.0.blocks.0.op.x_proj_weight: copying a param with shape torch.Size([4, 8, 96]) from checkpoint, the shape in current model is torch.Size([4, 8, 192]).
    size mismatch for layers.0.blocks.0.op.A_logs: copying a param with shape torch.Size([384, 1]) from checkpoint, the shape in current model is torch.Size([768, 1]).
    size mismatch for layers.0.blocks.0.op.Ds: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.0.blocks.0.op.dt_projs_weight: copying a param with shape torch.Size([4, 96, 6]) from checkpoint, the shape in current model is torch.Size([4, 192, 6]).
    size mismatch for layers.0.blocks.0.op.dt_projs_bias: copying a param with shape torch.Size([4, 96]) from checkpoint, the shape in current model is torch.Size([4, 192]).
    size mismatch for layers.0.blocks.0.op.out_norm.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for layers.0.blocks.0.op.out_norm.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for layers.0.blocks.0.op.in_proj.weight: copying a param with shape torch.Size([96, 96]) from checkpoint, the shape in current model is torch.Size([384, 96]).
    size mismatch for layers.0.blocks.0.op.conv2d.weight: copying a param with shape torch.Size([96, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([192, 1, 3, 3]).
    size mismatch for layers.0.blocks.0.op.out_proj.weight: copying a param with shape torch.Size([96, 96]) from checkpoint, the shape in current model is torch.Size([96, 192]).
    size mismatch for layers.0.blocks.1.op.x_proj_weight: copying a param with shape torch.Size([4, 8, 96]) from checkpoint, the shape in current model is torch.Size([4, 8, 192]).
    size mismatch for layers.0.blocks.1.op.A_logs: copying a param with shape torch.Size([384, 1]) from checkpoint, the shape in current model is torch.Size([768, 1]).
    size mismatch for layers.0.blocks.1.op.Ds: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.0.blocks.1.op.dt_projs_weight: copying a param with shape torch.Size([4, 96, 6]) from checkpoint, the shape in current model is torch.Size([4, 192, 6]).
    size mismatch for layers.0.blocks.1.op.dt_projs_bias: copying a param with shape torch.Size([4, 96]) from checkpoint, the shape in current model is torch.Size([4, 192]).
    size mismatch for layers.0.blocks.1.op.out_norm.weight: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for layers.0.blocks.1.op.out_norm.bias: copying a param with shape torch.Size([96]) from checkpoint, the shape in current model is torch.Size([192]).
    size mismatch for layers.0.blocks.1.op.in_proj.weight: copying a param with shape torch.Size([96, 96]) from checkpoint, the shape in current model is torch.Size([384, 96]).
    size mismatch for layers.0.blocks.1.op.conv2d.weight: copying a param with shape torch.Size([96, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([192, 1, 3, 3]).
    size mismatch for layers.0.blocks.1.op.out_proj.weight: copying a param with shape torch.Size([96, 96]) from checkpoint, the shape in current model is torch.Size([96, 192]).
    size mismatch for layers.1.blocks.0.op.x_proj_weight: copying a param with shape torch.Size([4, 14, 192]) from checkpoint, the shape in current model is torch.Size([4, 14, 384]).
    size mismatch for layers.1.blocks.0.op.A_logs: copying a param with shape torch.Size([768, 1]) from checkpoint, the shape in current model is torch.Size([1536, 1]).
    size mismatch for layers.1.blocks.0.op.Ds: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.1.blocks.0.op.dt_projs_weight: copying a param with shape torch.Size([4, 192, 12]) from checkpoint, the shape in current model is torch.Size([4, 384, 12]).
    size mismatch for layers.1.blocks.0.op.dt_projs_bias: copying a param with shape torch.Size([4, 192]) from checkpoint, the shape in current model is torch.Size([4, 384]).
    size mismatch for layers.1.blocks.0.op.out_norm.weight: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for layers.1.blocks.0.op.out_norm.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for layers.1.blocks.0.op.in_proj.weight: copying a param with shape torch.Size([192, 192]) from checkpoint, the shape in current model is torch.Size([768, 192]).
    size mismatch for layers.1.blocks.0.op.conv2d.weight: copying a param with shape torch.Size([192, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 1, 3, 3]).
    size mismatch for layers.1.blocks.0.op.out_proj.weight: copying a param with shape torch.Size([192, 192]) from checkpoint, the shape in current model is torch.Size([192, 384]).
    size mismatch for layers.1.blocks.1.op.x_proj_weight: copying a param with shape torch.Size([4, 14, 192]) from checkpoint, the shape in current model is torch.Size([4, 14, 384]).
    size mismatch for layers.1.blocks.1.op.A_logs: copying a param with shape torch.Size([768, 1]) from checkpoint, the shape in current model is torch.Size([1536, 1]).
    size mismatch for layers.1.blocks.1.op.Ds: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.1.blocks.1.op.dt_projs_weight: copying a param with shape torch.Size([4, 192, 12]) from checkpoint, the shape in current model is torch.Size([4, 384, 12]).
    size mismatch for layers.1.blocks.1.op.dt_projs_bias: copying a param with shape torch.Size([4, 192]) from checkpoint, the shape in current model is torch.Size([4, 384]).
    size mismatch for layers.1.blocks.1.op.out_norm.weight: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for layers.1.blocks.1.op.out_norm.bias: copying a param with shape torch.Size([192]) from checkpoint, the shape in current model is torch.Size([384]).
    size mismatch for layers.1.blocks.1.op.in_proj.weight: copying a param with shape torch.Size([192, 192]) from checkpoint, the shape in current model is torch.Size([768, 192]).
    size mismatch for layers.1.blocks.1.op.conv2d.weight: copying a param with shape torch.Size([192, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([384, 1, 3, 3]).
    size mismatch for layers.1.blocks.1.op.out_proj.weight: copying a param with shape torch.Size([192, 192]) from checkpoint, the shape in current model is torch.Size([192, 384]).
    size mismatch for layers.2.blocks.0.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.0.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.0.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.0.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.0.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.0.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.0.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.0.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.0.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.0.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.1.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.1.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.1.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.1.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.1.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.1.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.1.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.1.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.1.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.1.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.2.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.2.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.2.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.2.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.2.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.2.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.2.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.2.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.2.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.2.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.3.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.3.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.3.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.3.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.3.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.3.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.3.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.3.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.3.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.3.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.4.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.4.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.4.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.4.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.4.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.4.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.4.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.4.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.4.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.4.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.5.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.5.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.5.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.5.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.5.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.5.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.5.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.5.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.5.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.5.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.6.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.6.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.6.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.6.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.6.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.6.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.6.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.6.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.6.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.6.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.2.blocks.7.op.x_proj_weight: copying a param with shape torch.Size([4, 26, 384]) from checkpoint, the shape in current model is torch.Size([4, 26, 768]).
    size mismatch for layers.2.blocks.7.op.A_logs: copying a param with shape torch.Size([1536, 1]) from checkpoint, the shape in current model is torch.Size([3072, 1]).
    size mismatch for layers.2.blocks.7.op.Ds: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]).
    size mismatch for layers.2.blocks.7.op.dt_projs_weight: copying a param with shape torch.Size([4, 384, 24]) from checkpoint, the shape in current model is torch.Size([4, 768, 24]).
    size mismatch for layers.2.blocks.7.op.dt_projs_bias: copying a param with shape torch.Size([4, 384]) from checkpoint, the shape in current model is torch.Size([4, 768]).
    size mismatch for layers.2.blocks.7.op.out_norm.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.7.op.out_norm.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for layers.2.blocks.7.op.in_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([1536, 384]).
    size mismatch for layers.2.blocks.7.op.conv2d.weight: copying a param with shape torch.Size([384, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([768, 1, 3, 3]).
    size mismatch for layers.2.blocks.7.op.out_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([384, 768]).
    size mismatch for layers.3.blocks.0.op.x_proj_weight: copying a param with shape torch.Size([4, 50, 768]) from checkpoint, the shape in current model is torch.Size([4, 50, 1536]).
    size mismatch for layers.3.blocks.0.op.A_logs: copying a param with shape torch.Size([3072, 1]) from checkpoint, the shape in current model is torch.Size([6144, 1]).
    size mismatch for layers.3.blocks.0.op.Ds: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([6144]).
    size mismatch for layers.3.blocks.0.op.dt_projs_weight: copying a param with shape torch.Size([4, 768, 48]) from checkpoint, the shape in current model is torch.Size([4, 1536, 48]).
    size mismatch for layers.3.blocks.0.op.dt_projs_bias: copying a param with shape torch.Size([4, 768]) from checkpoint, the shape in current model is torch.Size([4, 1536]).
    size mismatch for layers.3.blocks.0.op.out_norm.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.3.blocks.0.op.out_norm.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.3.blocks.0.op.in_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
    size mismatch for layers.3.blocks.0.op.conv2d.weight: copying a param with shape torch.Size([768, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1536, 1, 3, 3]).
    size mismatch for layers.3.blocks.0.op.out_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([768, 1536]).
    size mismatch for layers.3.blocks.1.op.x_proj_weight: copying a param with shape torch.Size([4, 50, 768]) from checkpoint, the shape in current model is torch.Size([4, 50, 1536]).
    size mismatch for layers.3.blocks.1.op.A_logs: copying a param with shape torch.Size([3072, 1]) from checkpoint, the shape in current model is torch.Size([6144, 1]).
    size mismatch for layers.3.blocks.1.op.Ds: copying a param with shape torch.Size([3072]) from checkpoint, the shape in current model is torch.Size([6144]).
    size mismatch for layers.3.blocks.1.op.dt_projs_weight: copying a param with shape torch.Size([4, 768, 48]) from checkpoint, the shape in current model is torch.Size([4, 1536, 48]).
    size mismatch for layers.3.blocks.1.op.dt_projs_bias: copying a param with shape torch.Size([4, 768]) from checkpoint, the shape in current model is torch.Size([4, 1536]).
    size mismatch for layers.3.blocks.1.op.out_norm.weight: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.3.blocks.1.op.out_norm.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([1536]).
    size mismatch for layers.3.blocks.1.op.in_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([3072, 768]).
    size mismatch for layers.3.blocks.1.op.conv2d.weight: copying a param with shape torch.Size([768, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([1536, 1, 3, 3]).
    size mismatch for layers.3.blocks.1.op.out_proj.weight: copying a param with shape torch.Size([768, 768]) from checkpoint, the shape in current model is torch.Size([768, 1536]).
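
The log above repeats the same pattern for every block, so it can help to condense it before reading anything into it. The following is a minimal sketch, not part of the VMamba codebase: `summarize` and `parse_shape` are hypothetical helper names, and the regular expression assumes the exact message format shown in the log. It groups the mismatches by parameter suffix and reports the per-axis ratio between the current model's shape and the checkpoint's shape.

```python
import re
from collections import defaultdict

# Matches one "size mismatch" entry in the format printed in the log above.
MISMATCH = re.compile(
    r"size mismatch for (?P<name>\S+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[^\]]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[^\]]*)\]\)"
)


def parse_shape(text):
    """Turn '4, 8, 96' into (4, 8, 96)."""
    return tuple(int(part) for part in text.split(",") if part.strip())


def summarize(log_text):
    """Map each parameter suffix to the set of per-axis model/checkpoint ratios."""
    summary = defaultdict(set)
    for m in MISMATCH.finditer(log_text):
        # 'layers.0.blocks.1.op.Ds' -> 'op.Ds' (drop the layer and block indices)
        suffix = m["name"].split(".blocks.")[-1].split(".", 1)[-1]
        ckpt = parse_shape(m["ckpt"])
        model = parse_shape(m["model"])
        summary[suffix].add(tuple(b / a for a, b in zip(ckpt, model)))
    return dict(summary)


# Two entries copied from the log above.
sample = """
    size mismatch for layers.0.blocks.0.op.x_proj_weight: copying a param with shape torch.Size([4, 8, 96]) from checkpoint, the shape in current model is torch.Size([4, 8, 192]).
    size mismatch for layers.0.blocks.0.op.in_proj.weight: copying a param with shape torch.Size([96, 96]) from checkpoint, the shape in current model is torch.Size([384, 96]).
"""

for suffix, ratios in summarize(sample).items():
    print(suffix, sorted(ratios))
```

Across the full log, every affected axis in the current model is 2x the checkpoint's, except the output axis of `in_proj.weight`, which is 4x. One reading, inferred only from these shapes and not confirmed in this thread, is that the checkpoint was trained with a smaller inner width than `ssm_ratio=2.0` gives and with an `in_proj` that has no second (gating) half.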

My config is the same as the one in the classification section of the README:

    patch_size=4,
    in_chans=3,
    num_classes=1000,
    depths=[2, 2, 8, 2],
    dims=[96, 192, 384, 768],
    ssm_d_state=1,
    ssm_ratio=2.0,
    ssm_dt_rank="auto",
    ssm_act_layer="silu",
    ssm_conv=3,
    ssm_conv_bias=False,
    ssm_drop_rate=0.0,
    ssm_init="v0",
    forward_type="v05",
    mlp_ratio=4.0,
    mlp_act_layer="gelu",
    mlp_drop_rate=0.0,
    gmlp=False,
    drop_path_rate=0.1,
    patch_norm=True,
    norm_layer="LN",  # "BN", "LN2D"
    downsample_version="v3",  # "v1", "v2", "v3"
    patchembed_version="v2",  # "v1", "v2"
    use_checkpoint=False,
    posembed=False,
    imgsize=224,
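
To see which settings would reproduce the checkpoint's shapes, the shape arithmetic can be written out by hand. This is a sketch under stated assumptions: the formulas below are fitted to the shapes printed in the log (they are not copied from the VMamba source), and `expected_shapes`, `k_group`, and `with_gate` are hypothetical names introduced for illustration.

```python
import math


def expected_shapes(dim, ssm_ratio, d_state=1, dt_rank="auto",
                    k_group=4, with_gate=True, conv=3):
    """Shapes of one block's `op.*` parameters, as inferred from the log.

    k_group (number of scan directions) and with_gate (whether in_proj
    also produces a gating half) are assumptions made for this sketch.
    """
    d_inner = int(ssm_ratio * dim)
    # "auto" is assumed to mean ceil(dim / 16), which matches 6, 12, 24, 48 in the log.
    rank = math.ceil(dim / 16) if dt_rank == "auto" else dt_rank
    in_proj_out = 2 * d_inner if with_gate else d_inner
    return {
        "x_proj_weight": (k_group, rank + 2 * d_state, d_inner),
        "A_logs": (k_group * d_inner, d_state),
        "Ds": (k_group * d_inner,),
        "dt_projs_weight": (k_group, d_inner, rank),
        "dt_projs_bias": (k_group, d_inner),
        "out_norm.weight": (d_inner,),
        "in_proj.weight": (in_proj_out, dim),
        "conv2d.weight": (d_inner, 1, conv, conv),
        "out_proj.weight": (dim, d_inner),
    }


# First stage of the tiny model: dims[0] = 96.
as_configured = expected_shapes(96, ssm_ratio=2.0, with_gate=True)
candidate = expected_shapes(96, ssm_ratio=1.0, with_gate=False)

for name in as_configured:
    print(f"{name:18s} model={as_configured[name]} candidate={candidate[name]}")
```

With `dims[0]=96`, `ssm_ratio=2.0` and a gated `in_proj` reproduce the "current model" shapes in the log, while `ssm_ratio=1.0` without the gate reproduces the checkpoint shapes. If that inference holds, the checkpoint would need a config with those two settings changed; which `forward_type` value selects an ungated `in_proj` is not stated in this thread.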

aifeixingdelv commented 1 week ago

Ev