microsoft / scene_graph_benchmark

image scene graph generation benchmark
MIT License

Res152c4 on 4 datasets seems not right #23

Open helson73 opened 3 years ago

helson73 commented 3 years ago

VinVL's DOWNLOAD.md says: "We also provide the X152-C4 object detection config file and pretrained model on the merged four datasets (COCO with stuff, Visual Genome, Objects365 and Open Images). The labelmap to decode the 1848 classes can be found here. The first 1594 classes are exactly the VG classes, in the same order. The map from the COCO vocabulary to this merged vocabulary can be found here. The map from the Objects365 vocabulary to this merged vocabulary can be found here. The map from the OpenImages V5 vocabulary to this merged vocabulary can be found here."

But how do I actually run this pretrained model? Scene Graph Benchmark can't run it as-is, since the provided configuration file is not compatible with that package. I forced the config file through (deleting options one by one until yacs accepted it), and that let me run the pre-trained model, but the results are not right: the number of boxes is too small compared to other detectors (which should not happen)...

Any help please?

joeyy5588 commented 3 years ago

Hi, I've managed to run this pretrained model, and the visualization via demo_image.py looks fine. The steps are quite complicated, so I am not sure whether I've written down every step I've taken.

  1. Download the FourSets directory using azcopy from here: https://biglmdiag.blob.core.windows.net/vinvl/model_ckpts/od_models/FourSets
  2. Modify attr_frcnn_X152C4.yaml following the format of sgg_configs/vgattr/vinvl_x152c4.yaml (specifically, remove the redundant config options and set META_ARCHITECTURE to "AttrRCNN"); see the config sketch after this list
  3. Construct your own dataset following the instructions discussed in #7 (Correct way to extract image features with VinVL)
  4. Create your test.yaml file as mentioned in #6 (where can I find the dataset yaml in the config file?) and point DATASETS.TEST to that yaml file
  5. Either add LABELMAP_FILE to attr_frcnn_X152C4.yaml or add labelmap to your test.yaml file
  6. Adjust MODEL.ROI_HEADS.SCORE_THRESH to your desired value
  7. Set MODEL.WEIGHT to the path of the pretrained weight in your scripts
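
To make the steps concrete, here is a rough sketch of how the trimmed config might end up looking after steps 2 and 4-7. Everything under models/FourSets/ is a placeholder (I don't know the exact file names inside the download), and the key layout just mirrors sgg_configs/vgattr/vinvl_x152c4.yaml:

```yaml
# attr_frcnn_X152C4.yaml (sketch) -- only the keys the steps above touch.
# Paths/file names are placeholders; substitute your actual FourSets files.
MODEL:
  META_ARCHITECTURE: "AttrRCNN"                   # step 2
  WEIGHT: "models/FourSets/model_final.pth"       # step 7 (checkpoint name assumed)
  ROI_HEADS:
    SCORE_THRESH: 0.2                             # step 6: tune to your needs
DATASETS:
  TEST: ("mydataset/test.yaml",)                  # step 4: your own dataset yaml
  LABELMAP_FILE: "models/FourSets/labelmap.json"  # step 5 (file name assumed)
```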

I've successfully run both tools/test_sg_net.py and tools/demo/demo_image.py; please let me know if I've missed any steps.
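
For step 4, the test.yaml is essentially a small index of the TSV files you built for your own images. Below is a rough sketch assuming the ODTSVDataset-style keys discussed in #6 and #7; treat the key names as assumptions and verify them against the example dataset yamls shipped with the repo:

```yaml
# test.yaml (sketch) -- key names are assumptions based on the #6/#7 threads.
img: test.img.tsv            # base64-encoded images, one per row
hw: test.hw.tsv              # per-image height/width
label: test.label.tsv        # boxes/labels (can be dummy for pure inference)
linelist: test.linelist.tsv  # optional: subset of rows to run on
labelmap: labelmap.json      # alternative to DATASETS.LABELMAP_FILE (step 5)
```

With that in place, both entry points take the modified config via their config-file argument, e.g. `python tools/test_sg_net.py --config-file path/to/attr_frcnn_X152C4.yaml`.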

via815via commented 2 years ago

> *(quoting @joeyy5588's steps above)*

Did you run the code on your own dataset? I ran into some problems; can you help me?