Yuqifan1117 / CaCao

This is the official repository for the paper "Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World" (Accepted by ICCV 2023)

Dataset used for predicate extraction #10

Closed Yassin-fan closed 10 months ago

Yassin-fan commented 10 months ago

Hello, while reading triplet_extraction.py, I noticed that the triplet-extraction pipeline only processes coco/image_caption.json and saves the results in "total_image_region_triplets.json"; the cc3m dataset mentioned in the paper is not used.

In addition, vg/region_descriptions.json is processed separately and saved in "image_caption_triplet.json".

I also checked the two JSON files in the "dataset" folder, and their images correspond to the COCO dataset as well.
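For reference, here is a minimal sketch of what a caption-to-triplet step might look like. It assumes a spaCy-based subject–verb–object heuristic and an `{image_id: [captions]}` layout for image_caption.json, so the repository's actual triplet_extraction.py may well work differently:

```python
# Hypothetical sketch of caption-to-triplet extraction with spaCy;
# the repository's actual triplet_extraction.py may differ.
import json
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_triplets(caption):
    """Return naive (subject, predicate, object) triplets via dependency parsing."""
    triplets = []
    for token in nlp(caption):
        if token.pos_ != "VERB":
            continue
        subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
        objects = [c for c in token.children if c.dep_ in ("dobj", "attr")]
        for subj in subjects:
            for obj in objects:
                triplets.append((subj.lemma_, token.lemma_, obj.lemma_))
    return triplets

# Assumed layout: {image_id: [caption, ...]}; adjust to the real file schema.
with open("coco/image_caption.json") as f:
    image_captions = json.load(f)

all_triplets = {
    image_id: [t for cap in caps for t in extract_triplets(cap)]
    for image_id, caps in image_captions.items()
}

with open("total_image_region_triplets.json", "w") as f:
    json.dump(all_triplets, f)
```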

My questions are:

  1. Has the cc3m dataset been abandoned in favor of the VG dataset as a supplement? If so, when augmenting with the VG dataset, can the model be considered to have already been trained on this data?
  2. Is the coco/image_caption.json file a direct merge of the training and evaluation splits of the 2014 version?

Thanks for your help!

Yuqifan1117 commented 10 months ago

Thanks!

  1. We collect data directly from the Internet. Because COCO's labels are clearer and more accurate, we ultimately use the COCO dataset. The region_descriptions.json file is only used to explore the quality of the dataset; we do not use it for supplementary training (it is not contained in image_caption_triplet_all.json, which is used for CaCao training).
  2. Yes, we merge the training data of the 2014 version into image_caption.json.
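For anyone reproducing this step, below is a minimal sketch of how such a merge might be built from the official COCO 2014 caption annotations. The file paths and output layout are assumptions, not the repository's actual script:

```python
# Hypothetical sketch: build coco/image_caption.json from COCO 2014
# caption annotations. File paths and output layout are assumptions.
import json
from collections import defaultdict

def load_captions(path):
    """Group COCO caption annotations by image_id."""
    with open(path) as f:
        coco = json.load(f)
    per_image = defaultdict(list)
    for ann in coco["annotations"]:
        per_image[ann["image_id"]].append(ann["caption"])
    return per_image

merged = defaultdict(list)
# Per the reply above, at least the 2014 training split is included;
# add "annotations/captions_val2014.json" here if the val split is too.
for path in ["annotations/captions_train2014.json"]:
    for image_id, caps in load_captions(path).items():
        merged[image_id].extend(caps)

with open("coco/image_caption.json", "w") as f:
    json.dump(merged, f)
```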