microsoft / GLIP

Grounded Language-Image Pre-training
MIT License

MMDetection now supports GLIP inference and fine-tuning. #139

Open hhaAndroid opened 9 months ago

hhaAndroid commented 9 months ago

Hi all, MMDetection now supports GLIP inference and fine-tuning. The mAP we achieved in our reproduction is higher than the official results.

| Model | Zero-shot or Fine-tune | COCO mAP | Pre-Train Data | Config | Download |
| --- | --- | --- | --- | --- | --- |
| GLIP-T (A) | Zero-shot | 43.0 | O365 | | model |
| GLIP-T (A) | Fine-tune | 53.1 | O365 | | model \| log |
| GLIP-T (B) | Zero-shot | 44.9 | O365 | | model |
| GLIP-T (B) | Fine-tune | 54.1 | O365 | | model \| log |
| GLIP-T (C) | Zero-shot | 46.7 | O365,GoldG | | model |
| GLIP-T (C) | Fine-tune | 55.2 | O365,GoldG | | model \| log |
| GLIP-T | Zero-shot | 46.4 | O365,GoldG,CC3M,SBU | | model |
| GLIP-T | Fine-tune | 55.2 | O365,GoldG,CC3M,SBU | | model \| log |
| GLIP-L | Zero-shot | 51.3 | FourODs,GoldG,CC3M+12M,SBU | | model |
| GLIP-L | Fine-tune | 59.4 | FourODs,GoldG,CC3M+12M,SBU | | model \| log |
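
To make the zero-shot versus fine-tuned comparison easier to read, here is a small sketch that computes the COCO mAP gain from fine-tuning for each model. The numbers are copied from the table above. The names `results`, `finetune_gain`, and `gains` are only illustrative and are not part of MMDetection or GLIP.

```python
# COCO mAP copied from the table above: (zero-shot, fine-tuned).
# The dictionary and helper names are hypothetical, for illustration only.
results = {
    "GLIP-T (A)": (43.0, 53.1),
    "GLIP-T (B)": (44.9, 54.1),
    "GLIP-T (C)": (46.7, 55.2),
    "GLIP-T": (46.4, 55.2),
    "GLIP-L": (51.3, 59.4),
}


def finetune_gain(zero_shot: float, fine_tuned: float) -> float:
    """Return the mAP gain from fine-tuning, rounded to one decimal place."""
    return round(fine_tuned - zero_shot, 1)


# Gain per model, in the same order as the table.
gains = {name: finetune_gain(zs, ft) for name, (zs, ft) in results.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} mAP after fine-tuning")
```

With these numbers, every model gains between roughly 8 and 10 mAP from fine-tuning on COCO, and the gain is smaller for the models with stronger zero-shot results.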

For details, see https://github.com/open-mmlab/mmdetection/blob/dev-3.x/configs/glip/README.md

If you encounter any issues while using it, please feel free to create an issue.