bknyaz / sgg

Train Scene Graph Generation for Visual Genome and GQA in PyTorch >= 1.2 with improved zero and few-shot generalization.
https://arxiv.org/abs/2007.05756

Problems in pre-trained faster-rcnn detector #9

Open jkli-aa opened 2 years ago

jkli-aa commented 2 years ago

Hi, thank you for sharing this wonderful work!

I found a problem when loading the pre-trained file 'vg-faster-rcnn.tar'. The anchor ratios and anchor scales in neural-motifs are inconsistent with those in torchvision.models.detection:

motifs anchor ratios: (0.23232838, 0.63365731, 1.28478321, 3.15089189); scales: (2.22152954, 4.12315647, 7.21692515, 12.60263013, 22.7102731)
torchvision anchor ratios: (0.5, 1.0, 2.0); scales: (32, 64, 128, 256, 512)

Thus the pre-trained weights in 'vg-faster-rcnn.tar' do not match the torchvision model at rpn.head.bbox_pred: (120, 512, 1, 1) vs (60, 512, 1, 1).

I don't know whether my analysis above is correct, or whether this will affect the performance of the RPN.
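
The channel counts in the comment above can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming the torchvision convention for a single feature map: one anchor per (scale, ratio) pair and 4 box-regression outputs per anchor. The helper name `rpn_bbox_pred_channels` is made up for illustration and is not part of this repo or torchvision.

```python
def rpn_bbox_pred_channels(scales, ratios):
    """Output channels of the RPN box-regression conv for one feature map.

    Assumes one anchor per (scale, ratio) pair and 4 deltas per anchor.
    """
    num_anchors = len(scales) * len(ratios)
    return 4 * num_anchors


# Anchor settings quoted in the comment above.
motifs_ratios = (0.23232838, 0.63365731, 1.28478321, 3.15089189)
motifs_scales = (2.22152954, 4.12315647, 7.21692515, 12.60263013, 22.7102731)
tv_ratios = (0.5, 1.0, 2.0)
tv_scales = (32, 64, 128, 256, 512)

motifs_channels = rpn_bbox_pred_channels(motifs_scales, motifs_ratios)
tv_channels = rpn_bbox_pred_channels(tv_scales, tv_ratios)
print(motifs_channels, tv_channels)  # prints: 80 60
```

Under this assumption the torchvision settings give 60 channels, which matches the second shape quoted above. The motifs settings as listed would give 80 rather than 120, so the checkpoint may have been trained with a different anchor set than the one quoted; that is not confirmed here.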

jkli-aa commented 2 years ago

Well, it seems that this repo does not load the weights of rpn.head.bbox_pred. I am not sure whether the detector still works well without the pre-trained RPN weights. These are important parameters for sgdet.
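
To see which checkpoint entries would be skipped because of a shape mismatch, one option is to compare parameter shapes before loading. The following is only a sketch of that idea, not this repo's actual loading code: `split_by_shape` is a hypothetical helper, shapes are passed as plain tuples, and the `rpn.head.conv.weight` entry is an illustrative placeholder. With PyTorch, the shapes could be read from each tensor's `.shape` in the two state dicts.

```python
def split_by_shape(checkpoint_shapes, model_shapes):
    """Split checkpoint parameter names into loadable and skipped lists.

    A name is loadable only if the model has a parameter with the same
    name and exactly the same shape.
    """
    loadable, skipped = [], []
    for name, shape in checkpoint_shapes.items():
        target = model_shapes.get(name)
        if target is not None and tuple(target) == tuple(shape):
            loadable.append(name)
        else:
            skipped.append(name)
    return loadable, skipped


# Shapes for bbox_pred are the ones reported in this issue;
# the conv entry is a made-up example of a matching parameter.
checkpoint_shapes = {
    "rpn.head.conv.weight": (512, 512, 3, 3),
    "rpn.head.bbox_pred.weight": (120, 512, 1, 1),
}
model_shapes = {
    "rpn.head.conv.weight": (512, 512, 3, 3),
    "rpn.head.bbox_pred.weight": (60, 512, 1, 1),
}

loadable, skipped = split_by_shape(checkpoint_shapes, model_shapes)
print("loadable:", loadable)
print("skipped:", skipped)
```

Any name that ends up in `skipped` would keep its freshly initialized values in the model unless it is trained or converted separately.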