Dataset-PointingQA - GitHub Issues

IQ250 commented 1 year ago

Hello authors. Thanks for your effort in contributing the dataset, but I am confused about how to align the images in PF-1M with the open-source dataset [PointingQA]. For example,

`/pointingqa-main/Datasets/LookTwiceQA/images_with_points_train/train_42636.jpg`: how can I find this image in [PointingQA] or [Visual Genome]?

ChenDelong1999 commented 1 year ago

Hi,

Thank you for reaching out and bringing this issue to our attention. We sincerely apologize for any confusion that may have arisen.

For datasets such as PointQA and RefCOCOg, we create visual prompts (e.g., colored arrows or bounding boxes) based on the given annotations, for example:

The image index (e.g., 42636) does not correspond to an image ID in the original Visual Genome (VG) dataset, and we will upload the rendered images soon.

IQ250 commented 1 year ago

I see. Thanks, and I look forward to your paper being accepted.

ChenDelong1999 commented 1 year ago

Hi @IQ250, the raw images, resized so that their shortest side is 336 pixels, have been uploaded to the Hugging Face dataset repository. You can access it here. This zip file contains the rendered image data for PointQA (LookTwiceQA and LocalQA), RefCOCOg, and the RET-3 remote sensing captioning datasets (RSITMD, RSICD, UCM).
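
To illustrate what resizing to a shortest side of 336 pixels means for the image dimensions, here is a minimal sketch. It assumes the aspect ratio is preserved and the longer side is rounded to the nearest pixel; the actual resizing script is not shown in this thread, and `resized_size` is a hypothetical helper name.

```python
def resized_size(width: int, height: int, shortest: int = 336) -> tuple:
    """Return (new_width, new_height) with the shorter side scaled to `shortest`.

    Assumption: aspect ratio is preserved and sizes are rounded to the
    nearest integer. The real preprocessing may differ.
    """
    short_side = min(width, height)
    new_width = round(width * shortest / short_side)
    new_height = round(height * shortest / short_side)
    return new_width, new_height


# A 640x480 landscape image would become 448x336 under these assumptions.
print(resized_size(640, 480))
```

Under this assumption, bounding-box or point coordinates taken from the original annotations would need the same scale factor applied before they line up with the resized images.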

Below, you'll find a table that maps each dataset folder within the zip file to the corresponding image path in PF-1M:

| Dataset | Zip Folder | Image Path in PF-1M |
| --- | --- | --- |
| RefCOCOg | `resized_images_refcocog.zip` | `/refcocog/images_with_bbox/` |
| PointQA - LookTwiceQA | `resized_images_looktwice.zip` | `/pointingqa-main/Datasets/LookTwiceQA/images_with_points_train/` |
| PointQA - LocalQA | `resized_images_localqa.zip` | `/pointingqa-main/Datasets/LocalQA/images_with_points/` |
| RET-3 | `resized_images_rsitmd_rsicd_ucm.zip` | `/rsitmd_rsicd_ucm/images_no_tif/` |
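
The mapping in the table can be applied programmatically. The sketch below looks up which archive a PF-1M image path belongs to, based only on the path prefixes listed above. `locate_image` and `PREFIX_TO_ZIP` are hypothetical names, and the sketch assumes nothing about the directory layout inside each archive, so it returns only the archive name and the image file name.

```python
from pathlib import PurePosixPath

# PF-1M path prefix -> archive name, copied from the table above.
PREFIX_TO_ZIP = {
    "/refcocog/images_with_bbox/": "resized_images_refcocog.zip",
    "/pointingqa-main/Datasets/LookTwiceQA/images_with_points_train/": "resized_images_looktwice.zip",
    "/pointingqa-main/Datasets/LocalQA/images_with_points/": "resized_images_localqa.zip",
    "/rsitmd_rsicd_ucm/images_no_tif/": "resized_images_rsitmd_rsicd_ucm.zip",
}


def locate_image(pf1m_path: str) -> tuple:
    """Return (zip_name, file_name) for an image path found in PF-1M.

    Raises KeyError if the path does not start with a prefix from the table.
    """
    for prefix, zip_name in PREFIX_TO_ZIP.items():
        if pf1m_path.startswith(prefix):
            return zip_name, PurePosixPath(pf1m_path).name
    raise KeyError(f"No archive listed for {pf1m_path!r}")


zip_name, file_name = locate_image(
    "/pointingqa-main/Datasets/LookTwiceQA/images_with_points_train/train_42636.jpg"
)
print(zip_name, file_name)
```

For the path from the original question, this gives `resized_images_looktwice.zip` and `train_42636.jpg`. Where the file sits inside the archive would still need to be checked after downloading.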

I hope this helps! If you've got any more questions, just let me know.

ChenDelong1999 / polite-flamingo

Dataset-PointingQA #2