Jingkang50 / OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
https://psgdataset.org
MIT License

Loading pretrained model: The model and loaded state dict do not match exactly. #68

Closed: pingapplepen closed this issue 1 year ago

pingapplepen commented 1 year ago

Hi, I am testing the model with the pretrained checkpoint you provide, and I get the following warning:

creating index...
index created!
load checkpoint from local path: work_dirs/checkpoints/detr_pan_r50.pth
The model and loaded state dict do not match exactly

unexpected key in source state_dict: bbox_head.class_embed.weight, bbox_head.class_embed.bias, bbox_head.bbox_embed.layers.0.weight, bbox_head.bbox_embed.layers.0.bias, bbox_head.bbox_embed.layers.1.weight, bbox_head.bbox_embed.layers.1.bias, bbox_head.bbox_embed.layers.2.weight, bbox_head.bbox_embed.layers.2.bias, bbox_head.bbox_attention.q_linear.weight, bbox_head.bbox_attention.q_linear.bias, bbox_head.bbox_attention.k_linear.weight, bbox_head.bbox_attention.k_linear.bias, bbox_head.mask_head.lay1.weight, bbox_head.mask_head.lay1.bias, bbox_head.mask_head.gn1.weight, bbox_head.mask_head.gn1.bias, bbox_head.mask_head.lay2.weight, bbox_head.mask_head.lay2.bias, bbox_head.mask_head.gn2.weight, bbox_head.mask_head.gn2.bias, bbox_head.mask_head.lay3.weight, bbox_head.mask_head.lay3.bias, bbox_head.mask_head.gn3.weight, bbox_head.mask_head.gn3.bias, bbox_head.mask_head.lay4.weight, bbox_head.mask_head.lay4.bias, bbox_head.mask_head.gn4.weight, bbox_head.mask_head.gn4.bias, bbox_head.mask_head.lay5.weight, bbox_head.mask_head.lay5.bias, bbox_head.mask_head.gn5.weight, bbox_head.mask_head.gn5.bias, bbox_head.mask_head.out_lay.weight, bbox_head.mask_head.out_lay.bias, bbox_head.mask_head.adapter1.weight, bbox_head.mask_head.adapter1.bias, bbox_head.mask_head.adapter2.weight, bbox_head.mask_head.adapter2.bias, bbox_head.mask_head.adapter3.weight, bbox_head.mask_head.adapter3.bias

missing keys in source state_dict: bbox_head.obj_cls_embed.weight, bbox_head.obj_cls_embed.bias, bbox_head.obj_box_embed.layers.0.weight, bbox_head.obj_box_embed.layers.0.bias, bbox_head.obj_box_embed.layers.1.weight, bbox_head.obj_box_embed.layers.1.bias, bbox_head.obj_box_embed.layers.2.weight, bbox_head.obj_box_embed.layers.2.bias, bbox_head.sub_cls_embed.weight, bbox_head.sub_cls_embed.bias, bbox_head.sub_box_embed.layers.0.weight, bbox_head.sub_box_embed.layers.0.bias, bbox_head.sub_box_embed.layers.1.weight, bbox_head.sub_box_embed.layers.1.bias, bbox_head.sub_box_embed.layers.2.weight, bbox_head.sub_box_embed.layers.2.bias, bbox_head.rel_cls_embed.weight, bbox_head.rel_cls_embed.bias, bbox_head.sub_bbox_attention.q_linear.weight, bbox_head.sub_bbox_attention.q_linear.bias, bbox_head.sub_bbox_attention.k_linear.weight, bbox_head.sub_bbox_attention.k_linear.bias, bbox_head.obj_bbox_attention.q_linear.weight, bbox_head.obj_bbox_attention.q_linear.bias, bbox_head.obj_bbox_attention.k_linear.weight, bbox_head.obj_bbox_attention.k_linear.bias, bbox_head.sub_mask_head.lay1.weight, bbox_head.sub_mask_head.lay1.bias, bbox_head.sub_mask_head.gn1.weight, bbox_head.sub_mask_head.gn1.bias, bbox_head.sub_mask_head.lay2.weight, bbox_head.sub_mask_head.lay2.bias, bbox_head.sub_mask_head.gn2.weight, bbox_head.sub_mask_head.gn2.bias, bbox_head.sub_mask_head.lay3.weight, bbox_head.sub_mask_head.lay3.bias, bbox_head.sub_mask_head.gn3.weight, bbox_head.sub_mask_head.gn3.bias, bbox_head.sub_mask_head.lay4.weight, bbox_head.sub_mask_head.lay4.bias, bbox_head.sub_mask_head.gn4.weight, bbox_head.sub_mask_head.gn4.bias, bbox_head.sub_mask_head.lay5.weight, bbox_head.sub_mask_head.lay5.bias, bbox_head.sub_mask_head.gn5.weight, bbox_head.sub_mask_head.gn5.bias, bbox_head.sub_mask_head.out_lay.weight, bbox_head.sub_mask_head.out_lay.bias, bbox_head.sub_mask_head.adapter1.weight, bbox_head.sub_mask_head.adapter1.bias, bbox_head.sub_mask_head.adapter2.weight, 
bbox_head.sub_mask_head.adapter2.bias, bbox_head.sub_mask_head.adapter3.weight, bbox_head.sub_mask_head.adapter3.bias, bbox_head.obj_mask_head.lay1.weight, bbox_head.obj_mask_head.lay1.bias, bbox_head.obj_mask_head.gn1.weight, bbox_head.obj_mask_head.gn1.bias, bbox_head.obj_mask_head.lay2.weight, bbox_head.obj_mask_head.lay2.bias, bbox_head.obj_mask_head.gn2.weight, bbox_head.obj_mask_head.gn2.bias, bbox_head.obj_mask_head.lay3.weight, bbox_head.obj_mask_head.lay3.bias, bbox_head.obj_mask_head.gn3.weight, bbox_head.obj_mask_head.gn3.bias, bbox_head.obj_mask_head.lay4.weight, bbox_head.obj_mask_head.lay4.bias, bbox_head.obj_mask_head.gn4.weight, bbox_head.obj_mask_head.gn4.bias, bbox_head.obj_mask_head.lay5.weight, bbox_head.obj_mask_head.lay5.bias, bbox_head.obj_mask_head.gn5.weight, bbox_head.obj_mask_head.gn5.bias, bbox_head.obj_mask_head.out_lay.weight, bbox_head.obj_mask_head.out_lay.bias, bbox_head.obj_mask_head.adapter1.weight, bbox_head.obj_mask_head.adapter1.bias, bbox_head.obj_mask_head.adapter2.weight, bbox_head.obj_mask_head.adapter2.bias, bbox_head.obj_mask_head.adapter3.weight, bbox_head.obj_mask_head.adapter3.bias
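The two lists above can be summarised by submodule to see the pattern more easily: the checkpoint carries a single `bbox_head.class_embed` / `bbox_head.mask_head`, while the model expects separate `sub_`, `obj_` and `rel_` variants. Below is a minimal sketch of such a comparison in plain Python. The helper names `diff_state_dict_keys` and `group_by_submodule` are hypothetical (they are not part of OpenPSG or mmdet), and only a handful of key names copied from the lists are used as sample input.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) as sorted lists.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not use
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


def group_by_submodule(keys, depth=2):
    """Count keys per dotted prefix, e.g. 'bbox_head.mask_head'."""
    counts = {}
    for key in keys:
        prefix = ".".join(key.split(".")[:depth])
        counts[prefix] = counts.get(prefix, 0) + 1
    return counts


# Sample input: a few key names copied from the warning above.
model_keys = [
    "bbox_head.sub_cls_embed.weight",
    "bbox_head.obj_cls_embed.weight",
    "bbox_head.rel_cls_embed.weight",
    "bbox_head.sub_mask_head.lay1.weight",
]
checkpoint_keys = [
    "bbox_head.class_embed.weight",
    "bbox_head.mask_head.lay1.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing by submodule:   ", group_by_submodule(missing))
print("unexpected by submodule:", group_by_submodule(unexpected))
```

In a real session the two key collections would presumably come from `model.state_dict().keys()` and from the loaded checkpoint file; how the checkpoint nests its weights is an assumption to verify against the actual file.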

Testing itself runs without errors, but the results are:

SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=phrdet, type=Mean Recall.
SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=sgdet, type=NoGraphConstraint @ 56 Mean Recall.
SGG eval: mR @ 20: 0.0000; mR @ 50: 0.0000; mR @ 100: 0.0000; for mode=phrdet, type=NoGraphConstraint @ 56 Mean Recall.

{'sgdet_recall_R_20': nan, 'sgdet_recall_R_50': nan, 'sgdet_recall_R_100': nan, 'sgdet_mean_recall_mR_20': 0.0, 'sgdet_mean_recall_mR_50': 0.0, 'sgdet_mean_recall_mR_100': 0.0, 'sgdet_copystat': 'sgdet_recall_R_20: nan\nsgdet_recall_R_50: nan\nsgdet_recall_R_100: nan\nsgdet_mean_recall_mR_20: 0.000\nsgdet_mean_recall_mR_50: 0.000\nsgdet_mean_recall_mR_100: 0.000\n', 'sgdet_runtime_eval_str' and

| over | 0.0000 | in front of | 0.0000 | beside | 0.0000 |
| on | 0.0000 | in | 0.0000 | attached to | 0.0000 |
| hanging from | 0.0000 | on back of | 0.0000 | falling off | 0.0000 |
| going down | 0.0000 | painted on | 0.0000 | walking on | 0.0000 |
| running on | 0.0000 | crossing | 0.0000 | standing on | 0.0000 |
| lying on | 0.0000 | sitting on | 0.0000 | flying over | 0.0000 |
| jumping over | 0.0000 | jumping from | 0.0000 | wearing | 0.0000 |
| holding | 0.0000 | carrying | 0.0000 | looking at | 0.0000 |
| guiding | 0.0000 | kissing | 0.0000 | eating | 0.0000 |
| drinking | 0.0000 | feeding | 0.0000 | biting | 0.0000 |
| catching | 0.0000 | picking | 0.0000 | playing with | 0.0000 |
| chasing | 0.0000 | climbing | 0.0000 | cleaning | 0.0000 |
| playing | 0.0000 | touching | 0.0000 | pushing | 0.0000 |
| pulling | 0.0000 | opening | 0.0000 | cooking | 0.0000 |
| talking to | 0.0000 | throwing | 0.0000 | slicing | 0.0000 |
| driving | 0.0000 | riding | 0.0000 | parked on | 0.0000 |
| driving on | 0.0000 | about to hit | 0.0000 | kicking | 0.0000 |
| swinging | 0.0000 | entering | 0.0000 | exiting | 0.0000 |
| enclosing | 0.0000 | leaning on | 0.0000 | None | None

All results are zeroes.

During testing, I noticed some warnings in the log; I am not sure whether they are related to the all-zero results:

home/anaconda3/envs/detectron2/lib/python3.7/site-packages/mmdet/models/utils/positional_encoding.py:81: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  dim_t = self.temperature**(2 * (dim_t // 2) / self.num_feats)
/home/OpenPSG/openpsg/models/relation_heads/psgtr_head.py:940: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  triplet_index = r_indexes // self.num_relations
/home/anaconda3/envs/detectron2/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3441: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/anaconda3/envs/detectron2/lib/python3.7/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
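
The two `__floordiv__` warnings concern how negative quotients are rounded. Here is a minimal pure-Python sketch of the difference between the 'trunc' and 'floor' modes named in the message. The helpers `div_trunc` and `div_floor` are made up for illustration and only stand in for `torch.div(a, b, rounding_mode=...)`; they are not PyTorch functions.

```python
import math


def div_trunc(a, b):
    """Round the quotient toward zero (the 'trunc' behaviour)."""
    return math.trunc(a / b)


def div_floor(a, b):
    """Round the quotient toward negative infinity (the 'floor' behaviour)."""
    return math.floor(a / b)


# The two modes agree for non-negative operands and differ for negative ones.
for a, b in [(7, 2), (0, 2), (-7, 2)]:
    print(f"a={a:>2}, b={b}: trunc={div_trunc(a, b):>2}, floor={div_floor(a, b):>2}")
```

If `r_indexes` only ever holds non-negative indices (an assumption; the issue does not show how it is computed), both modes give the same `triplet_index`, so this warning on its own would not be expected to change the results.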