lifuguan opened this issue 1 year ago
Hi @lifuguan,
Thanks a lot for your feedback!
We checked the code and found the problem, though some of the mismatched keys are intentional. The reported mismatch comes from a naming difference: the released model uses cpb_mlp while the provided checkpoints use rpe_mlp. Luckily, we have fixed this issue, so you can pull the recent updates to check it.
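The rename described above can be sketched as a simple key remapping over the checkpoint's state dict. This is an illustrative sketch, not the repo's actual fix; the key names below are shortened examples, and a real state dict would come from torch.load(checkpoint_path)["model"]:

```python
def remap_rpe_to_cpb(state_dict):
    """Rename old-style 'rpe_mlp' keys to the 'cpb_mlp' naming
    expected by the released Swin-V2 model; other keys pass through."""
    return {k.replace("rpe_mlp", "cpb_mlp"): v for k, v in state_dict.items()}

# Toy checkpoint with one old-style key and one unaffected key.
ckpt = {
    "layers.0.blocks.0.attn.rpe_mlp.0.weight": "w0",
    "layers.0.blocks.0.attn.qkv.weight": "w1",
}
fixed = remap_rpe_to_cpb(ckpt)
```

After the remap, fixed contains the cpb_mlp-style key and can be passed to load_state_dict as usual.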
However, the missing relative_coords_table and relative_position_index are intentional. Since the input resolution is set to 192 during pre-training and 224 during fine-tuning, we simply delete these two parameters when loading pre-trained models, to avoid loading them into wrongly shaped tensors; the model then regenerates the correct parameters for the new resolution. You can refer to these lines for more details:
https://github.com/microsoft/Swin-Transformer/blob/main/utils_simmim.py#L189
https://github.com/microsoft/Swin-Transformer/blob/main/models/swin_transformer_v2.py#L97
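The deletion step described above amounts to filtering resolution-dependent buffers out of the state dict before calling model.load_state_dict(state_dict, strict=False). This is a minimal sketch based on the linked utils_simmim.py, using a plain dict with toy values in place of a real checkpoint:

```python
# Buffers whose shapes depend on the input resolution (192 at
# pre-training vs. 224 at fine-tuning) must not be loaded; the
# model regenerates them for the new resolution.
RESOLUTION_DEPENDENT = ("relative_coords_table", "relative_position_index")

def drop_resolution_buffers(state_dict):
    """Remove resolution-dependent buffers from a checkpoint state dict."""
    return {k: v for k, v in state_dict.items()
            if not k.endswith(RESOLUTION_DEPENDENT)}

ckpt = {
    "layers.0.blocks.0.attn.relative_coords_table": "table_192px",
    "layers.0.blocks.0.attn.relative_position_index": "index_192px",
    "layers.0.blocks.0.attn.qkv.weight": "w",
}
filtered = drop_resolution_buffers(ckpt)
```

Loading the filtered dict with strict=False then reports these keys as missing_keys, which is exactly the expected _IncompatibleKeys output shown below.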
So currently, after pulling the latest repo, the _IncompatibleKeys message you get when fine-tuning the provided pre-trained models should look like this:
[2022-09-29 16:01:49 simmim_finetune](utils_simmim.py 119): INFO _IncompatibleKeys(missing_keys=['layers.0.blocks.0.attn.relative_coords_table', 'layers.0.blocks.0.attn.relative_position_index', 'layers.0.blocks.1.attn_mask', 'layers.0.blocks.1.attn.relative_coords_table', 'layers.0.blocks.1.attn.relative_position_index', 'layers.1.blocks.0.attn.relative_coords_table', 'layers.1.blocks.0.attn.relative_position_index', 'layers.1.blocks.1.attn_mask', 'layers.1.blocks.1.attn.relative_coords_table', 'layers.1.blocks.1.attn.relative_position_index', 'layers.2.blocks.0.attn.relative_coords_table', 'layers.2.blocks.0.attn.relative_position_index', 'layers.2.blocks.1.attn.relative_coords_table', 'layers.2.blocks.1.attn.relative_position_index', 'layers.2.blocks.2.attn.relative_coords_table', 'layers.2.blocks.2.attn.relative_position_index', 'layers.2.blocks.3.attn.relative_coords_table', 'layers.2.blocks.3.attn.relative_position_index', 'layers.2.blocks.4.attn.relative_coords_table', 'layers.2.blocks.4.attn.relative_position_index', 'layers.2.blocks.5.attn.relative_coords_table', 'layers.2.blocks.5.attn.relative_position_index', 'layers.2.blocks.6.attn.relative_coords_table', 'layers.2.blocks.6.attn.relative_position_index', 'layers.2.blocks.7.attn.relative_coords_table', 'layers.2.blocks.7.attn.relative_position_index', 'layers.2.blocks.8.attn.relative_coords_table', 'layers.2.blocks.8.attn.relative_position_index', 'layers.2.blocks.9.attn.relative_coords_table', 'layers.2.blocks.9.attn.relative_position_index', 'layers.2.blocks.10.attn.relative_coords_table', 'layers.2.blocks.10.attn.relative_position_index', 'layers.2.blocks.11.attn.relative_coords_table', 'layers.2.blocks.11.attn.relative_position_index', 'layers.2.blocks.12.attn.relative_coords_table', 'layers.2.blocks.12.attn.relative_position_index', 'layers.2.blocks.13.attn.relative_coords_table', 'layers.2.blocks.13.attn.relative_position_index', 'layers.2.blocks.14.attn.relative_coords_table', 
'layers.2.blocks.14.attn.relative_position_index', 'layers.2.blocks.15.attn.relative_coords_table', 'layers.2.blocks.15.attn.relative_position_index', 'layers.2.blocks.16.attn.relative_coords_table', 'layers.2.blocks.16.attn.relative_position_index', 'layers.2.blocks.17.attn.relative_coords_table', 'layers.2.blocks.17.attn.relative_position_index', 'layers.3.blocks.0.attn.relative_coords_table', 'layers.3.blocks.0.attn.relative_position_index', 'layers.3.blocks.1.attn.relative_coords_table', 'layers.3.blocks.1.attn.relative_position_index', 'head.weight', 'head.bias'], unexpected_keys=['mask_token'])
Hope this will solve your problem.
First of all, thanks a lot for solving the problem of loading pre-trained models. However, a new problem has appeared: Swin-V2 doesn't seem to work when the test images have a different resolution.
I tried to fine-tune the SimMIM pre-trained Swin-V2 model following get_started.md.
The pre-trained weights are available here: https://msravcghub.blob.core.windows.net/simmim-release/swinv2/pretrain/swinv2_base_22k_125k.pth
Problem report
Unfortunately, the logger shows many incompatible keys when loading the pre-trained weight file, which means the pre-trained weights are not loaded properly.
Weight visualization
Furthermore, I double-checked the weight keys of the pre-trained file by loading the checkpoint and printing its state-dict keys. The keys are incompatible with the expected model: every key carries an unwanted encoder. prefix. I would appreciate it if this could be fixed soon!
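One common workaround for this kind of wrapped checkpoint is to strip the encoder. prefix before loading, since SimMIM pre-training stores the backbone under an encoder submodule. This is a hypothetical sketch of that fix, not necessarily what the repo's own loading helper does:

```python
def strip_encoder_prefix(state_dict, prefix="encoder."):
    """Remove the SimMIM 'encoder.' wrapper prefix from checkpoint keys
    so they match a plain Swin-V2 backbone; other keys are untouched."""
    return {(k[len(prefix):] if k.startswith(prefix) else k): v
            for k, v in state_dict.items()}

# Toy checkpoint: one wrapped backbone key, one pre-training-only key.
ckpt = {
    "encoder.patch_embed.proj.weight": "w",
    "mask_token": "m",
}
stripped = strip_encoder_prefix(ckpt)
```

After stripping, backbone keys line up with the model, and leftovers such as mask_token are simply reported under unexpected_keys when loading with strict=False.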