davidnvq / grit

GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)

The Visual Genome dataset #38

Closed by bai-24 1 year ago

bai-24 commented 1 year ago

Dear Author, the Visual Genome dataset I downloaded is not partitioned. How do I divide it into training, validation, and test sets? And how do I build the annotations folder?

davidnvq commented 1 year ago

You can find the splits here: https://github.com/peteanderson80/bottom-up-attention/tree/master/data/genome

I also converted them into JSON files that follow the COCO annotation format. Feel free to download them (including the notebook used for the conversion) from here: https://drive.google.com/drive/folders/1c7eWTjrlo_UJuKH3GwHIvTYMwzTkbIRd?usp=sharing
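For readers who want to reproduce this kind of conversion themselves, here is a minimal sketch of turning Visual Genome object annotations into a COCO-style dictionary for one split. It is not the notebook from the Drive folder. It assumes that each split line starts with an image path such as `VG_100K/2.jpg`, that the object annotations follow the Visual Genome `objects.json` layout (`image_id` plus a list of objects with `x`, `y`, `w`, `h`, `names`), and that a `class2ind` mapping from class name to integer id is available. The function names `parse_split_lines` and `vg_to_coco` are hypothetical.

```python
def parse_split_lines(lines):
    """Collect integer image ids from split lines.

    Assumption: each non-empty line begins with an image path whose
    file stem is the numeric image id, e.g. 'VG_100K/2.jpg xml/2.xml'.
    """
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        image_path = line.split()[0]
        stem = image_path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        ids.add(int(stem))
    return ids


def vg_to_coco(vg_objects, split_ids, class2ind):
    """Build a COCO-style dict for the images listed in split_ids.

    Assumption: vg_objects is a list of
    {"image_id": int, "objects": [{"x", "y", "w", "h", "names": [...]}]}.
    Objects whose first name is not in class2ind are skipped.
    """
    images, annotations = [], []
    ann_id = 1
    for entry in vg_objects:
        image_id = entry["image_id"]
        if image_id not in split_ids:
            continue
        # Width and height are omitted here; add them if your loader needs them.
        images.append({"id": image_id, "file_name": f"{image_id}.jpg"})
        for obj in entry["objects"]:
            names = obj.get("names") or []
            name = names[0].strip().lower() if names else None
            if name not in class2ind:
                continue
            annotations.append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": class2ind[name],
                "bbox": [obj["x"], obj["y"], obj["w"], obj["h"]],  # COCO uses [x, y, w, h]
                "area": obj["w"] * obj["h"],
                "iscrowd": 0,
            })
            ann_id += 1
    categories = [
        {"id": idx, "name": name}
        for name, idx in sorted(class2ind.items(), key=lambda kv: kv[1])
    ]
    return {"images": images, "annotations": annotations, "categories": categories}
```

Running this once per split file and writing each result with `json.dump` would give one annotation file per split. Whether the field set matches what GRIT's loaders expect is not confirmed here, so compare against the files in the Drive folder before training.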

bai-24 commented 1 year ago

Thank you. But vg_train.yaml, vg_val.yaml, vg_test.yaml, and coco_val.yaml reference many files that I don't know how to generate. Can you provide them?

davidnvq commented 1 year ago

Can you check here? https://github.com/davidnvq/grit/tree/main/configs/detection/datasets

bai-24 commented 1 year ago

In vg_train.yaml: train_ann_lmdb, train_objects.json, vgcocooiobjects_v1_class2ind.json, attribute2ind.json, oid2attr.json.
In vg_val.yaml: val_objects.json, val_coco.pkl.
In vg_test.yaml: test_objects.json, test_coco.pkl.
In coco_val.yaml: anno_1848_val2017.json, coco_vgoiv6_class2ind.json.

davidnvq commented 1 year ago

@bai-24 Sorry for the late reply. You can download all of them here: https://drive.google.com/drive/folders/1c7eWTjrlo_UJuKH3GwHIvTYMwzTkbIRd?usp=share_link

In fact, you don't need train_ann_lmdb. Please find the place in the Dataset code where it is used and disable the dependency on this file.
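After downloading, it may help to confirm that every file named in this thread is actually present before launching training. The sketch below is one way to do that. The file names come from the comments above, with train_ann_lmdb left out since it is not needed. The helper name `find_missing` and the idea of searching a single download directory recursively are assumptions, not part of the GRIT code.

```python
from pathlib import Path

# File names listed in this thread, grouped by the config that references them.
# train_ann_lmdb is intentionally excluded because it is not required.
REQUIRED = {
    "vg_train.yaml": [
        "train_objects.json",
        "vgcocooiobjects_v1_class2ind.json",
        "attribute2ind.json",
        "oid2attr.json",
    ],
    "vg_val.yaml": ["val_objects.json", "val_coco.pkl"],
    "vg_test.yaml": ["test_objects.json", "test_coco.pkl"],
    "coco_val.yaml": ["anno_1848_val2017.json", "coco_vgoiv6_class2ind.json"],
}


def find_missing(root, required=REQUIRED):
    """Return {config name: [missing file names]} for files not found under root.

    Assumption: the downloaded files live somewhere below `root`; the search
    is recursive and matches on file name only, not on the full path.
    """
    present = {p.name for p in Path(root).rglob("*") if p.is_file()}
    missing = {}
    for config, names in required.items():
        absent = [name for name in names if name not in present]
        if absent:
            missing[config] = absent
    return missing
```

Calling `find_missing("path/to/download")` returns an empty dict when everything is in place. The paths inside each YAML config still have to point at the right locations, which this check does not verify.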