Closed: calebhemara closed this issue 1 year ago
Hi @calebhemara, thanks for your interest in our work.
I do not encounter the error on my end. Are you sure your detectron2
version matches the one we suggest in the installation instructions?
Also, it seems to be only a warning. Can you share the complete traceback log?
Strange... yes all versions match the documentation.
It is a warning, and the model still outputs a result, but the result is entirely inaccurate (pseudo-random). I suspect it has to do with incorrect checkpoint key structuring from the .pth/.yaml files.
I am running on a MacBook Pro with an M1 Pro chip, hence on CPU; perhaps this is an important factor.
Log below, thanks again!
Weight format of OneFormerHead have changed! Please upgrade your models. Applying automatic conversion now ... WARNING [11/26 23:57:35 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint: sem_seg_head.pixel_decoder.adapter_1.norm.{bias, weight} sem_seg_head.pixel_decoder.adapter_1.weight sem_seg_head.pixel_decoder.input_proj.0.0.{bias, weight} sem_seg_head.pixel_decoder.input_proj.0.1.{bias, weight} sem_seg_head.pixel_decoder.input_proj.1.0.{bias, weight} sem_seg_head.pixel_decoder.input_proj.1.1.{bias, weight} sem_seg_head.pixel_decoder.input_proj.2.0.{bias, weight} sem_seg_head.pixel_decoder.input_proj.2.1.{bias, weight} sem_seg_head.pixel_decoder.layer_1.norm.{bias, weight} sem_seg_head.pixel_decoder.layer_1.weight sem_seg_head.pixel_decoder.mask_features.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.norm2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.0.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.norm2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.self_attn.attention_weights.{bias, weight} 
sem_seg_head.pixel_decoder.transformer.encoder.layers.1.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.1.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.norm2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.2.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.norm2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.3.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.norm2.{bias, weight} 
sem_seg_head.pixel_decoder.transformer.encoder.layers.4.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.4.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.linear1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.linear2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.norm1.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.norm2.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.transformer.encoder.layers.5.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.transformer.level_embed WARNING [11/26 23:57:35 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model: prompt_ctx.weight text_encoder.ln_final.{bias, weight} text_encoder.positional_embedding text_encoder.token_embedding.weight text_encoder.transformer.resblocks.0.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.0.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.0.ln_1.{bias, weight} text_encoder.transformer.resblocks.0.ln_2.{bias, weight} text_encoder.transformer.resblocks.0.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.0.mlp.c_proj.{bias, weight} text_encoder.transformer.resblocks.1.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.1.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.1.ln_1.{bias, weight} text_encoder.transformer.resblocks.1.ln_2.{bias, 
weight} text_encoder.transformer.resblocks.1.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.1.mlp.c_proj.{bias, weight} text_encoder.transformer.resblocks.2.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.2.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.2.ln_1.{bias, weight} text_encoder.transformer.resblocks.2.ln_2.{bias, weight} text_encoder.transformer.resblocks.2.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.2.mlp.c_proj.{bias, weight} text_encoder.transformer.resblocks.3.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.3.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.3.ln_1.{bias, weight} text_encoder.transformer.resblocks.3.ln_2.{bias, weight} text_encoder.transformer.resblocks.3.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.3.mlp.c_proj.{bias, weight} text_encoder.transformer.resblocks.4.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.4.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.4.ln_1.{bias, weight} text_encoder.transformer.resblocks.4.ln_2.{bias, weight} text_encoder.transformer.resblocks.4.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.4.mlp.c_proj.{bias, weight} text_encoder.transformer.resblocks.5.attn.{in_proj_bias, in_proj_weight} text_encoder.transformer.resblocks.5.attn.out_proj.{bias, weight} text_encoder.transformer.resblocks.5.ln_1.{bias, weight} text_encoder.transformer.resblocks.5.ln_2.{bias, weight} text_encoder.transformer.resblocks.5.mlp.c_fc.{bias, weight} text_encoder.transformer.resblocks.5.mlp.c_proj.{bias, weight} text_projector.layers.0.{bias, weight} text_projector.layers.1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.adapter_1.norm.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.adapter_1.weight sem_seg_head.pixel_decoder.pixel_decoder.input_proj.0.0.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.input_proj.0.1.{bias, weight} 
sem_seg_head.pixel_decoder.pixel_decoder.input_proj.1.0.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.input_proj.1.1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.input_proj.2.0.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.input_proj.2.1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.layer_1.norm.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.layer_1.weight sem_seg_head.pixel_decoder.pixel_decoder.mask_features.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.norm1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.norm1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.self_attn.sampling_offsets.{bias, weight} 
sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.1.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.norm1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.2.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.norm1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.3.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.norm1.{bias, weight} 
sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.4.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.linear1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.linear2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.norm1.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.norm2.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.self_attn.attention_weights.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.self_attn.output_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.self_attn.sampling_offsets.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.5.self_attn.value_proj.{bias, weight} sem_seg_head.pixel_decoder.pixel_decoder.transformer.level_embed
Hi @calebhemara, did you change the config file? From your shared log, it seems like you are using the wrong Pixel Decoder.
Please confirm that the values in the config file are correct.
Thanks. No changes to config file/s. I've done a fresh clone (steps below), and still have the same error!
1) Cloned the repo
2) Downloaded the .yaml config and corresponding checkpoint
3) Moved the checkpoint to the repo directory and the .yaml to ./configs/ade20k/dinat
4) Added a line to demo.py at the end of def setup_cfg(args):
   cfg.MODEL.DEVICE = 'cpu'
5) Ran the script:

```shell
python demo.py --config-file ./configs/ade20k/dinat/oneformer_dinat_large_bs16_160k_896x896.yaml \
  --input ./IMG_8617_C.jpg \
  --output ./IMG_8617_C_S.jpg \
  --task 'semantic' \
  --opts MODEL.IS_TRAIN False MODEL.IS_DEMO True MODEL.WEIGHTS ./896x896_250_16_dinat_l_oneformer_ade20k_160k.pth
```
Thanks again
Hi @calebhemara, I have two questions for you: can you print the values of `cfg.MODEL.SEM_SEG_HEAD.PIXEL_DECODER_NAME` and `cfg.MODEL.ONE_FORMER.TRANSFORMER_IN_FEATURE` and share those with me here? They should match the ones here.
Thanks @praeclarumjj3

`print(cfg.MODEL.SEM_SEG_HEAD.PIXEL_DECODER_NAME, cfg.MODEL.ONE_FORMER.TRANSFORMER_IN_FEATURE)`

outputs: `MSDeformAttnPixelDecoder multi_scale_pixel_decoder`
I'll keep doing some homework and hopefully figure it out.
The configuration seems correct. Did you try a different config file?
Thanks @praeclarumjj3, I've tried ConvNeXt, Swin, and DiNAT configs and .pth files. Perhaps it has something to do with the lack of a CUDA GPU on the M1 Pro system. I've tried to find the checkpoint-loading route for setting the device to torch.device('cpu'), because my suspicion is that the .pth checkpoint is defaulting to loading on torch.device('cuda:0').
Still no luck 🤷‍♂️
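One way to chase a suspicion like this is to inspect the checkpoint's key structure directly and tally keys by prefix, which makes a duplicated segment easy to spot. A minimal sketch (the helper is plain Python; the `torch.load` usage and the `"model"` nesting in the comments are assumptions about the checkpoint layout, not verified against this repo):

```python
from collections import Counter

def prefix_histogram(state_dict, depth=2):
    """Tally state-dict keys by their first `depth` dotted components,
    so a duplicated segment like 'pixel_decoder.pixel_decoder' stands out."""
    return Counter(".".join(k.split(".")[:depth]) for k in state_dict)

# Assumed usage against a downloaded checkpoint:
#   import torch
#   ckpt = torch.load("896x896_250_16_dinat_l_oneformer_ade20k_160k.pth",
#                     map_location="cpu")
#   state = ckpt.get("model", ckpt)  # detectron2 checkpoints typically nest under "model"
#   for prefix, n in prefix_histogram(state, depth=3).most_common():
#       print(prefix, n)
```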
I don't think it has anything related to the availability of the CUDA GPU. In our colab demo, we load the models on the CPU.
I think I know where the issue is. When you are loading the checkpoint, it's reading the keys for the pixel decoder as `sem_seg_head.pixel_decoder.pixel_decoder.transformer.encoder.layers.0.linear2.{bias, weight}`. Instead, it should be `sem_seg_head.pixel_decoder.transformer.encoder.layers.0.linear2.{bias, weight}`. There's an extra `pixel_decoder.` segment in the keys. Did you change the checkpoint after downloading? This is indeed strange.
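If the checkpoint really does carry the duplicated segment, one possible workaround is to rewrite the affected keys and save a fixed copy. A hedged sketch, useful only if the extra segment lives in the checkpoint rather than the model (the `torch` load/save lines in the comments are assumptions about the checkpoint layout, not a verified fix):

```python
def strip_duplicated_segment(state_dict,
                             bad="sem_seg_head.pixel_decoder.pixel_decoder.",
                             good="sem_seg_head.pixel_decoder."):
    """Rename keys that start with the duplicated prefix; leave others intact."""
    return {
        (good + k[len(bad):] if k.startswith(bad) else k): v
        for k, v in state_dict.items()
    }

# Assumed usage (key names and nesting hypothetical):
#   import torch
#   ckpt = torch.load("checkpoint.pth", map_location="cpu")
#   ckpt["model"] = strip_duplicated_segment(ckpt["model"])
#   torch.save(ckpt, "checkpoint_fixed.pth")
```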
I am running inference on GPU. The issue crops up because of the different installation methods used. Initially, I used `torch 1.9.0@cu113` and `detectron2` + `natten` compiled on `cu113`. It threw the same errors as mentioned. I used `cog` containers to build the environment and run inference. I then moved to the installation mentioned in this colab repo for OneFormer and was able to run inference just fine, using the Swin backbone and `ade20k` dataset config.
Thanks for your comment, @pratos. @calebhemara, were you able to solve this issue?
Closing this issue for now. Feel free to re-open if you face any more issues.
Thanks for your incredible work team. Getting this error on inference:
Weight format of OneFormerHead have changed! Please upgrade your models. Applying automatic conversion now ... WARNING [11/26 13:32:01 fvcore.common.checkpoint]: Some model parameters or buffers are not found in the checkpoint:
Using this model for config & checkpoint: OneFormer | DiNAT-L† | 896×896
I am running inference on CPU with: `cfg.MODEL.DEVICE = 'cpu'`
Any idea where I'm going wrong would be greatly appreciated. Thanks!