PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
BSD 3-Clause "New" or "Revised" License
Details about Visual-genome Dataset #79
Closed
FingerRec closed 2 years ago
Thanks for this good work!
I see that the images are split into two parts at https://visualgenome.org/api/v0/api_home.html.
However, vg_caption.json does not indicate how these two subsets should be processed.
Could you kindly provide more details about how you processed the VG dataset?
BTW, did you use V1.0 or V1.1 of VG?