yekeren / Cap2Det

Implementation of our ICCV 2019 paper "Cap2Det: Learning to Amplify Weak Caption Supervision for Object Detection"
Apache License 2.0

Other datasets with both bounding box annotations and captions #27

Closed chencjGene closed 4 years ago

chencjGene commented 4 years ago

Hi,

To train and evaluate Cap2Det, datasets with both bounding box annotations and captions are needed (like COCO and Flickr30K). Are there any other datasets like these two?

Best,

Changjian

yekeren commented 4 years ago

Hi, Changjian,

Actually, only evaluation requires bounding boxes (e.g., Flickr does not have bounding box annotations). For training, you can check Conceptual Captions, a large dataset of paired image-caption annotations. If you are looking for an evaluation dataset, you could take a look at Visual Genome (VG), which has both bounding boxes and captions. However, VG's captions describe regions rather than the whole image.

Thanks, best regards,

Keren

chencjGene commented 4 years ago

Hi Keren,

Thanks. Yes, I am looking for datasets for evaluation. Apart from Visual Genome, are there other datasets that have both bounding boxes and captions describing the whole image?

yekeren commented 4 years ago

Only a few datasets meet your requirements. Maybe you can follow the approach used in Visual Commonsense Reasoning (VCR): use an off-the-shelf object detector to provide the bounding box annotations for evaluation.
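As a rough illustration of that idea, here is a minimal sketch that turns detector predictions into COCO-style pseudo ground-truth annotations. It is not code from Cap2Det or VCR. The function name `detections_to_coco`, the input layout (a dict of per-image detections with `box`, `label`, and `score` keys), and the 0.5 score threshold are all assumptions made for the example; the detector itself is left out, so any model that produces corner-format boxes with scores could feed it.

```python
from typing import Dict, List


def detections_to_coco(
    detections: Dict[int, List[dict]],
    score_threshold: float = 0.5,  # assumed cutoff, tune for your detector
) -> List[dict]:
    """Convert detector output into COCO-style annotation records.

    `detections` maps an image id to a list of dicts with keys
    "box" ([x1, y1, x2, y2] corners), "label" (category id), and "score".
    This input layout is hypothetical, not a specific library's format.
    """
    annotations = []
    next_id = 1
    for image_id in sorted(detections):
        for det in detections[image_id]:
            # Keep only confident detections as pseudo ground truth.
            if det["score"] < score_threshold:
                continue
            x1, y1, x2, y2 = det["box"]
            width, height = x2 - x1, y2 - y1
            # Skip degenerate boxes.
            if width <= 0 or height <= 0:
                continue
            annotations.append({
                "id": next_id,
                "image_id": image_id,
                "category_id": det["label"],
                # COCO stores boxes as [x, y, width, height].
                "bbox": [x1, y1, width, height],
                "area": width * height,
                "iscrowd": 0,
            })
            next_id += 1
    return annotations


# Toy input: one confident detection and one below the threshold.
example = {
    1: [
        {"box": [10.0, 20.0, 110.0, 220.0], "label": 18, "score": 0.92},
        {"box": [0.0, 0.0, 50.0, 50.0], "label": 1, "score": 0.30},
    ]
}
pseudo_gt = detections_to_coco(example, score_threshold=0.5)
```

Annotations produced this way inherit the detector's mistakes, so evaluation numbers measure agreement with that detector rather than with human labels.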

Thanks, best regards,

Keren

chencjGene commented 4 years ago

Got it! Many thanks!