Describe the bug
I'm getting "UserWarning: position encoding of key ismissing in MultiheadAttention" when I run inference or try to fine-tune MMGroundingDINO. Is that expected?
Reproduction
$ python demo/image_demo.py image.jpg configs/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det.py --weights grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth --texts 'bench . car . person . orange . bicycle .'
Loads checkpoint by local backend from path: /home/user/mmdetection/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth
The model and loaded state dict do not match exactly
unexpected key in source state_dict: language_model.language_backbone.body.model.embeddings.position_ids
03/18 14:21:57 - mmengine - WARNING - Failed to search registry with scope "mmdet" in the "function" registry tree. As a workaround, the current "function" registry in "mmengine" is used to build instance. This may cause unexpected failure when running the built modules. Please check whether "mmdet" is a correct scope, or whether the registry is initialized.
/home/user/.pyenv/versions/mmdet/lib/python3.10/site-packages/mmengine/visualization/visualizer.py:196: UserWarning: Failed to add <class 'mmengine.visualization.vis_backend.LocalVisBackend'>, please provide the `save_dir` argument.
warnings.warn(f'Failed to add {vis_backend.__class__}, '
[nltk_data] Downloading package punkt to ~/nltk_data...
[nltk_data] Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] ~/nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
noun_phrases: ['bench', 'car', 'person', 'orange', 'bicycle']
/home/user/.pyenv/versions/mmdet/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/user/.pyenv/versions/mmdet/lib/python3.10/site-packages/mmcv/cnn/bricks/transformer.py:524: UserWarning: position encoding of key ismissing in MultiheadAttention.
warnings.warn(f'position encoding of key is'
Environment
Bug fix
I'm not sure if this is expected, but this is coming from the MultiheadAttention module defined in mmcv/cnn/bricks/transformer.py.
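To find out which attention layer is called without a key position encoding, one option is to escalate just this warning to an exception so that Python prints a full traceback. Below is a minimal sketch using only the standard `warnings` module. `emit_like_mmcv` is a hypothetical stand-in for the real call in mmcv, and the message text is copied from the log above (including the missing space in "ismissing"). It says nothing about whether the warning is harmless.

```python
import warnings

# Prefix of the message as it appears in the log; filterwarnings treats it
# as a regex matched against the start of the warning text.
PATTERN = r"position encoding of key"


def emit_like_mmcv():
    # Hypothetical stand-in for the warnings.warn call in
    # mmcv/cnn/bricks/transformer.py; not the library's actual code.
    warnings.warn("position encoding of key ismissing in MultiheadAttention.")


def raises_when_escalated():
    # Temporarily turn only the matching UserWarning into an exception.
    with warnings.catch_warnings():
        warnings.filterwarnings("error", message=PATTERN, category=UserWarning)
        try:
            emit_like_mmcv()
        except UserWarning as exc:
            # In a real run you would let this propagate to see the traceback.
            return str(exc)
    return None


if __name__ == "__main__":
    print(raises_when_escalated())
```

In an actual run, the `warnings.filterwarnings("error", ...)` call would go near the top of the script before the model is built, and the resulting traceback should show which module invoked MultiheadAttention without a key position encoding.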