salesforce / BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License
4.86k stars 648 forks

Details about Visual-genome Dataset #79

Closed FingerRec closed 2 years ago

FingerRec commented 2 years ago

Thanks for this good work!

I see that there are two image parts listed at https://visualgenome.org/api/v0/api_home.html.

However, vg_caption.json does not indicate how these two subsets should be processed.

Could you kindly provide more details about how the VG dataset was processed?

BTW, did you use V1.0 or V1.1 for VG?
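For anyone hitting the same two-part download: a common workaround (not necessarily what the BLIP authors did) is to either merge both image folders into one directory, or resolve each image against both parts at lookup time. A minimal sketch of the latter, assuming the two official zips extract to their default `VG_100K` and `VG_100K_2` directories and images are named by their integer id:

```python
from pathlib import Path


def resolve_vg_image(image_id, vg_root):
    """Return the path to a Visual Genome image, searching both
    download parts. Directory names below are the defaults created
    when the two official zips are extracted."""
    for part in ("VG_100K", "VG_100K_2"):
        candidate = Path(vg_root) / part / f"{image_id}.jpg"
        if candidate.exists():
            return candidate
    raise FileNotFoundError(f"image {image_id} not found under {vg_root}")
```

A dataset loader can then call `resolve_vg_image` per annotation entry, so the two subsets never need to be distinguished in the annotation file itself.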

woctezuma commented 2 years ago

Maybe related:

FingerRec commented 2 years ago

Solved, thanks for your timely feedback!