pzzhang / VinVL

project page for VinVL

Res152c4 on 4 datasets seems not right #12

Open helson73 opened 3 years ago

helson73 commented 3 years ago

DOWNLOAD.md says: "We also provide the X152-C4 object detection config file and pretrained model on the merged four datasets (COCO with stuff, Visual Genome, Objects365 and Open Images). The labelmap to decode the 1848 can be found here. The first 1594 classes are exactly VG classes, with the same order. The map from COCO vocabulary to this merged vocabulary can be found here. The map from Objects365 vocabulary to this merged vocabulary can be found here. The map from OpenImages V5 vocabulary to this merged vocabulary can be found here."

But how do I run this pretrained model? Scene Graph Benchmark evidently can't run it, since the configuration file is not compatible with that package. I forced the config file to load by deleting options one by one until yacs accepted it, so I managed to run the pretrained model, but the results are not right: the number of boxes is too small compared to other detectors (which should not be the case) ...

Any help please?
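
For reference, the workaround described above (deleting options until yacs accepts the file) can be sketched as a recursive filter against the package's default config tree. This is only a minimal sketch using plain dictionaries: `prune_unknown_keys` and all sample keys are hypothetical and are not part of Scene Graph Benchmark or yacs. Listing the dropped keys shows exactly which options the package never sees, which is a natural place to start looking when the results seem off.

```python
from typing import Any, Dict, List, Tuple


def prune_unknown_keys(
    loaded: Dict[str, Any],
    defaults: Dict[str, Any],
    prefix: str = "",
) -> Tuple[Dict[str, Any], List[str]]:
    """Keep only the keys of `loaded` that also exist in `defaults`.

    Returns the pruned config and the dotted paths of every dropped key.
    """
    pruned: Dict[str, Any] = {}
    dropped: List[str] = []
    for key, value in loaded.items():
        path = f"{prefix}{key}"
        if key not in defaults:
            # The package does not know this option; record it and skip it.
            dropped.append(path)
        elif isinstance(value, dict) and isinstance(defaults[key], dict):
            # Both sides are sections; recurse into the nested section.
            sub, sub_dropped = prune_unknown_keys(value, defaults[key], path + ".")
            pruned[key] = sub
            dropped.extend(sub_dropped)
        else:
            pruned[key] = value
    return pruned, dropped


# Hypothetical keys, for illustration only (not the real X152-C4 config).
defaults = {
    "MODEL": {"WEIGHT": "", "ROI_HEADS": {"NMS": 0.5}},
    "TEST": {"IMS_PER_BATCH": 1},
}
loaded = {
    "MODEL": {"WEIGHT": "model.pth", "ROI_HEADS": {"NMS": 0.3, "EXTRA_OPTION": 1}},
    "UNKNOWN_SECTION": {"FOO": 1},
}

pruned, dropped = prune_unknown_keys(loaded, defaults)
print("dropped options:", dropped)
```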

zhyunlong commented 3 years ago

Same question, have you found the reason?

helson73 commented 3 years ago

> Same question, have you found the reason?

Refer to here. They said that following their procedure would generate correct results. BTW, I haven't verified it yet.