facebookresearch / Mask2Former

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
MIT License

Some model parameters or buffers are not found in the checkpoint when testing the model #95

Open krumo opened 2 years ago

krumo commented 2 years ago

Hi, I am trying to evaluate the model trained on Cityscapes by running:

```
python train_net.py --config-file configs/cityscapes/panoptic-segmentation/maskformer2_R50_bs16_90k.yaml --eval-only MODEL.WEIGHTS exp/model_final.pth
```

However, when I load the model I get the following warning:

```
WARNING [05/12 17:01:40 mask2former.modeling.meta_arch.mask_former_head]: Weight format of MaskFormerHead have changed! Please upgrade your models. Applying automatic conversion now ...
WARNING [05/12 17:01:40 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
sem_seg_head.pixel_decoder.adapter_1.norm.{bias, weight}
sem_seg_head.pixel_decoder.adapter_1.weight
sem_seg_head.pixel_decoder.input_proj.{0,1,2}.{0,1}.{bias, weight}
sem_seg_head.pixel_decoder.layer_1.norm.{bias, weight}
sem_seg_head.pixel_decoder.layer_1.weight
sem_seg_head.pixel_decoder.mask_features.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.linear1.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.linear2.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.norm1.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.norm2.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.self_attn.attention_weights.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.self_attn.output_proj.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.self_attn.sampling_offsets.{bias, weight}
sem_seg_head.pixel_decoder.transformer.encoder.layers.{0-5}.self_attn.value_proj.{bias, weight}
sem_seg_head.pixel_decoder.transformer.level_embed
WARNING [05/12 17:01:40 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model:
sem_seg_head.pixel_decoder.pixel_decoder.* (the same keys as above, but each with an extra "pixel_decoder." segment)
```

And the test results are clearly wrong: PQ, SQ, and RQ are all 0. Would you mind sharing some suggestions?
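One workaround that avoids editing the repository code would be to rename the offending keys in the checkpoint itself before evaluation. A minimal sketch, assuming the doubled-prefix pattern shown in the warning above (the function name and file paths are my own):

```python
OLD_PREFIX = "sem_seg_head.pixel_decoder.pixel_decoder."
NEW_PREFIX = "sem_seg_head.pixel_decoder."

def strip_doubled_prefix(state_dict):
    """Collapse the doubled 'pixel_decoder.' segment; other keys pass through."""
    return {
        (NEW_PREFIX + k[len(OLD_PREFIX):] if k.startswith(OLD_PREFIX) else k): v
        for k, v in state_dict.items()
    }

# Applied to a Detectron2-style checkpoint it might look like:
#   ckpt = torch.load("exp/model_final.pth", map_location="cpu")
#   ckpt["model"] = strip_doubled_prefix(ckpt["model"])
#   torch.save(ckpt, "exp/model_final_fixed.pth")
```

This leaves the repository untouched, at the cost of rewriting the checkpoint file.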

lcy0604 commented 2 years ago

I also met this problem. Have you solved it?

heshuting555 commented 2 years ago

I also met this problem. You can downgrade detectron2 to an earlier commit; e.g., 932f25ad38768dc88cc032f3db57ce30d2bea9c1 works for me.

SantiUsma commented 2 years ago

I solved this problem. Note that every unmatched key in the checkpoint has the same name as in the model, but with an extra "pixel_decoder." segment. In the file mask2former/modeling/meta_arch/mask_former_head.py, comment out or delete lines 33-36 (the block guarded by `if "sem_seg_head" in k and not k.startswith(prefix + "predictor"):`); those lines add the "pixel_decoder." prefix to the checkpoint keys.
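Instead of deleting the conversion outright, it could also be made idempotent: insert the prefix only when a key does not already have it. A hypothetical sketch of such a guard (the key layout is inferred from the warning above; the function is mine, not the repository's):

```python
def upgrade_key(k, prefix="sem_seg_head."):
    """Insert 'pixel_decoder.' into an old-format key, but leave keys that
    already carry the segment (or belong to the predictor) untouched."""
    body = k[len(prefix):]
    if body.startswith("pixel_decoder.") or body.startswith("predictor."):
        return k  # already new-format, or not a pixel-decoder key
    return prefix + "pixel_decoder." + body
```

With a guard like this, running the conversion on an already-converted checkpoint is a no-op instead of producing the doubled keys.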

dickrsunny commented 1 year ago

Just returning early from `_load_from_state_dict` in the file mask2former/modeling/meta_arch/mask_former_head.py works for me; that skips the same renaming step.
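Why bypassing the subclass hook helps can be seen with a toy stand-in: if the override that renames keys never runs, the loader receives the keys unchanged. A self-contained illustration with plain Python classes standing in for the real detectron2 modules (none of these class names exist in the repository):

```python
class BaseHead:
    def _load_from_state_dict(self, state_dict):
        # Stand-in for the parent loader: consume keys as-is.
        self.loaded = dict(state_dict)

class LegacyConvertingHead(BaseHead):
    def _load_from_state_dict(self, state_dict):
        # Stand-in for the unconditional conversion: blindly prepend the prefix,
        # which corrupts checkpoints that are already in the new format.
        renamed = {"pixel_decoder." + k: v for k, v in state_dict.items()}
        super()._load_from_state_dict(renamed)

# Bypassing the subclass hook (the effect of the early return) leaves
# already-correct keys untouched.
LegacyConvertingHead._load_from_state_dict = BaseHead._load_from_state_dict
```

The early return has the same effect as the patch on the last line: the conversion simply never fires.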