Open bhat-prashant opened 1 year ago
Hello, sorry for the late response.
You should be able to modify this script. The `drawn_images` folder contains two folders: `bbox` and `bbox_logs`. `bbox` contains two folders, `train` and `val`, each of which contains a `qa` and a `qar` folder holding the drawn images.
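As a rough sketch of that layout, the snippet below lists the directories described above and reports any that are missing. The helper names (`expected_dirs`, `missing_dirs`) and the `root` argument are hypothetical and not part of the repository. The contents of `bbox_logs` are not specified here, so the check only confirms that the folder exists.

```python
from pathlib import Path

# Folder names taken from the description above.
SPLITS = ("train", "val")
MODES = ("qa", "qar")


def expected_dirs(root):
    """Return the directories expected under a drawn_images root (hypothetical helper)."""
    root = Path(root)
    dirs = [root / "bbox_logs"]  # contents unspecified; only existence is checked
    for split in SPLITS:
        for mode in MODES:
            dirs.append(root / "bbox" / split / mode)
    return dirs


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    return [d for d in expected_dirs(root) if not d.is_dir()]
```

Calling `missing_dirs("drawn_images")` after running the drawing script should return an empty list if the output matches this structure.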
First of all, a great repository for multimodal continual learning. Kudos!
I have a question about how to populate drawn_images for the VCR dataset:
The drawn_images folder for the VCR task can be generated from the original vcr1images, using the scripts available [here](https://github.com/rowanz/merlot/tree/main/downstream/vcr/data).
In the link you mentioned, there are three scripts, namely prepare_data, draw_segms and draw_bbox. I understand these scripts generate segmentations and bounding boxes for the images in vcr1images.
Could you please clarify how I should populate drawn_images? What are the expected folder structure and contents?
Thanks in advance,