facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

Pretrained model on Visual Genome? #993

Open VSJMilewski opened 4 years ago

VSJMilewski commented 4 years ago

🚀 Feature

I was wondering whether models pretrained on the Visual Genome dataset will be released. Right now, all the models in the MODEL_ZOO are trained and evaluated on MS-COCO.

Motivation & Examples

Many published papers use Faster R-CNN features extracted from a model trained on Visual Genome. To compare against these publications, it would be good to use similar pretraining. Adding such models to detectron2 would let many researchers easily extract comparable features from their data, and it would save a lot of time and energy if not everybody had to train this themselves.

Thanks in advance.

Vimos commented 4 years ago

+1. Many vision-BERT-based models use VG-pretrained features.

Yudezhi commented 4 years ago

+1. Alternatively, official code for training on the VG dataset would help. I have run into some problems with this recently. Thanks!

nilinykh commented 4 years ago

+1. It would be great to have Detectron2 pre-trained on the VG dataset. It would be useful for various NLP-related tasks, image captioning for example.

Yudezhi commented 4 years ago

The VG dataset does not have mask annotations, so it cannot be used to train a model for instance segmentation. "Learning to Segment Every Thing" shows how to train with COCO and VG.

endernewton commented 4 years ago

(just saw this issue) If it helps, we have released a VG pre-training codebase based on detectron2 at https://github.com/facebookresearch/grid-feats-vqa. While the focus is on grid features, we also included a faithful reimplementation of the bottom-up features proposed in the UpDn model (https://arxiv.org/abs/1707.07998).