Jingkang50 / OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
https://psgdataset.org
MIT License
407 stars · 68 forks

Questions about the pretrained DETR models for PSGTR and PSGFormer #102

Open · Yassin-fan opened this issue 1 year ago

Yassin-fan commented 1 year ago

Thanks for your work! I have successfully installed the framework and run the code with no errors reported. However, I have a problem similar to #33: during training, the evaluation results are always zero. [screenshot]

Comparing against the logs you provided, I found that my log contains nothing related to model loading: [screenshot]

I used the latest code and did not make any changes.
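One quick check for this symptom, assuming the config's `load_from` points at the path below: confirm that the pretrained checkpoint actually exists and is readable, since a silently skipped `load_from` would explain both the all-zero results and the absence of loading messages. A minimal sketch:

```python
# Minimal sketch (assumes torch is installed and that the config's
# `load_from` uses this path): verify the checkpoint exists and loads.
import os
import torch

ckpt_path = './work_dirs/checkpoints/detr4psgformer_r50.pth'
assert os.path.isfile(ckpt_path), f'checkpoint not found: {ckpt_path}'

ckpt = torch.load(ckpt_path, map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # mmcv-style checkpoints nest weights under 'state_dict'
print(f'loaded {len(state_dict)} tensors; sample key: {next(iter(state_dict))}')
```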

My questions are:

1. Have you ever encountered this problem?

2. Is the weight-loading output missing from my logs because it was removed in later versions of the code, or because the model failed to load?

3. Most importantly: how were the pretrained DETR models such as detr_pan_r50.pth and detr4psgformer_r50.pth obtained, and can we train them ourselves? What is the difference between them? (In my understanding, both pretrained models use their encoder part for global feature extraction.)
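On question 3, one way to see concretely how the two checkpoints differ is to diff their key sets. A minimal sketch, assuming both files are mmcv-style checkpoints stored at the paths used by the configs:

```python
# Minimal sketch: compare the parameter names stored in the two pretrained
# DETR checkpoints (assumes both files nest their weights under 'state_dict',
# as mmcv-style checkpoints do).
import torch

def checkpoint_keys(path):
    ckpt = torch.load(path, map_location='cpu')
    return set(ckpt.get('state_dict', ckpt).keys())

pan_keys = checkpoint_keys('./work_dirs/checkpoints/detr_pan_r50.pth')
psg_keys = checkpoint_keys('./work_dirs/checkpoints/detr4psgformer_r50.pth')

print('only in detr_pan_r50.pth:      ', sorted(pan_keys - psg_keys))
print('only in detr4psgformer_r50.pth:', sorted(psg_keys - pan_keys))
print('shared keys:', len(pan_keys & psg_keys))
```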

Thanks!

Yassin-fan commented 1 year ago

I found that when loading the DETR pretrained model ./work_dirs/checkpoints/detr4psgformer_r50.pth with the psgformer_r50_psg config, there were many missing-key warnings:

missing keys in source state_dict: bbox_head.transformer.decoder2.layers.0.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.0.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.0.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.0.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.0.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.0.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.0.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.0.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.0.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.0.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.0.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.0.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.0.norms.0.weight, bbox_head.transformer.decoder2.layers.0.norms.0.bias, bbox_head.transformer.decoder2.layers.0.norms.1.weight, bbox_head.transformer.decoder2.layers.0.norms.1.bias, bbox_head.transformer.decoder2.layers.0.norms.2.weight, bbox_head.transformer.decoder2.layers.0.norms.2.bias, bbox_head.transformer.decoder2.layers.1.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.1.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.1.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.1.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.1.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.1.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.1.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.1.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.1.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.1.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.1.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.1.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.1.norms.0.weight, bbox_head.transformer.decoder2.layers.1.norms.0.bias, bbox_head.transformer.decoder2.layers.1.norms.1.weight, bbox_head.transformer.decoder2.layers.1.norms.1.bias, bbox_head.transformer.decoder2.layers.1.norms.2.weight, bbox_head.transformer.decoder2.layers.1.norms.2.bias, bbox_head.transformer.decoder2.layers.2.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.2.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.2.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.2.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.2.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.2.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.2.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.2.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.2.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.2.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.2.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.2.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.2.norms.0.weight, bbox_head.transformer.decoder2.layers.2.norms.0.bias, bbox_head.transformer.decoder2.layers.2.norms.1.weight, bbox_head.transformer.decoder2.layers.2.norms.1.bias, bbox_head.transformer.decoder2.layers.2.norms.2.weight, 
bbox_head.transformer.decoder2.layers.2.norms.2.bias, bbox_head.transformer.decoder2.layers.3.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.3.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.3.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.3.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.3.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.3.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.3.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.3.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.3.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.3.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.3.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.3.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.3.norms.0.weight, bbox_head.transformer.decoder2.layers.3.norms.0.bias, bbox_head.transformer.decoder2.layers.3.norms.1.weight, bbox_head.transformer.decoder2.layers.3.norms.1.bias, bbox_head.transformer.decoder2.layers.3.norms.2.weight, bbox_head.transformer.decoder2.layers.3.norms.2.bias, bbox_head.transformer.decoder2.layers.4.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.4.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.4.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.4.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.4.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.4.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.4.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.4.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.4.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.4.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.4.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.4.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.4.norms.0.weight, bbox_head.transformer.decoder2.layers.4.norms.0.bias, bbox_head.transformer.decoder2.layers.4.norms.1.weight, bbox_head.transformer.decoder2.layers.4.norms.1.bias, bbox_head.transformer.decoder2.layers.4.norms.2.weight, bbox_head.transformer.decoder2.layers.4.norms.2.bias, bbox_head.transformer.decoder2.layers.5.attentions.0.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.5.attentions.0.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.5.attentions.0.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.5.attentions.0.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.5.attentions.1.attn.in_proj_weight, bbox_head.transformer.decoder2.layers.5.attentions.1.attn.in_proj_bias, bbox_head.transformer.decoder2.layers.5.attentions.1.attn.out_proj.weight, bbox_head.transformer.decoder2.layers.5.attentions.1.attn.out_proj.bias, bbox_head.transformer.decoder2.layers.5.ffns.0.layers.0.0.weight, bbox_head.transformer.decoder2.layers.5.ffns.0.layers.0.0.bias, bbox_head.transformer.decoder2.layers.5.ffns.0.layers.1.weight, bbox_head.transformer.decoder2.layers.5.ffns.0.layers.1.bias, bbox_head.transformer.decoder2.layers.5.norms.0.weight, bbox_head.transformer.decoder2.layers.5.norms.0.bias, bbox_head.transformer.decoder2.layers.5.norms.1.weight, bbox_head.transformer.decoder2.layers.5.norms.1.bias, bbox_head.transformer.decoder2.layers.5.norms.2.weight, 
bbox_head.transformer.decoder2.layers.5.norms.2.bias, bbox_head.transformer.decoder2.post_norm.weight, bbox_head.transformer.decoder2.post_norm.bias, bbox_head.rel_query_embed.weight, bbox_head.sub_query_update.0.weight, bbox_head.sub_query_update.0.bias, bbox_head.sub_query_update.2.weight, bbox_head.sub_query_update.2.bias, bbox_head.obj_query_update.0.weight, bbox_head.obj_query_update.0.bias, bbox_head.obj_query_update.2.weight, bbox_head.obj_query_update.2.bias, bbox_head.sop_query_update.0.weight, bbox_head.sop_query_update.0.bias, bbox_head.sop_query_update.2.weight, bbox_head.sop_query_update.2.bias, bbox_head.rel_cls_embed.weight, bbox_head.rel_cls_embed.bias

Could this be the problem? I have not modified any code. @Jingkang50
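One plausible reading, not confirmed by the maintainers in this thread: every key in the warning belongs to the second decoder (decoder2) and the relation-query heads that PSGFormer adds on top of plain DETR, and a DETR-only checkpoint cannot contain those weights, so the warning may be expected rather than a sign of loading failure. A minimal check, pasting the warning text in verbatim:

```python
# Minimal check (the prefix list below is read off the warning above; paste
# the full comma-separated key list into `missing_keys_text`): if every
# missing key starts with one of these PSGFormer-specific prefixes, the
# warning only covers modules that plain DETR does not have.
psgformer_only_prefixes = (
    'bbox_head.transformer.decoder2.',
    'bbox_head.rel_query_embed',
    'bbox_head.sub_query_update',
    'bbox_head.obj_query_update',
    'bbox_head.sop_query_update',
    'bbox_head.rel_cls_embed',
)

missing_keys_text = '...'  # paste the key list from the log here
missing_keys = [k.strip() for k in missing_keys_text.split(',') if k.strip()]
unexplained = [k for k in missing_keys if not k.startswith(psgformer_only_prefixes)]
print('keys not covered by PSGFormer-specific modules:', unexplained)
```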

wyz-gitt commented 9 months ago

Hello, have you solved this problem yet?

> (quotes the identical missing-keys log from the comment above)

jiuxuanth commented 2 months ago

Hello, have you solved this problem yet? I also met this problem.