sail-sg / ptp

[CVPR2023] The code for 《Position-guided Text Prompt for Vision-Language Pre-training》
https://arxiv.org/abs/2212.09737
Apache License 2.0

Pre-extracted Image Features for Visual Genome (VG) #7

Closed: 9115jin closed this issue 1 year ago

9115jin commented 1 year ago

Hello,

I'm re-implementing ptp-blip for my research and ran into a problem at the second step of ptp/DATASET.md, "2. Download/Prepare Corpus (image-text pair)." Following the instructions, I downloaded the object features for COCO2014 train/val/test, 2015 test, CC3M, and SBU with the provided download_cc3m_predictions.sh script. However, I couldn't find a download link for the VG (Visual Genome) features, and I'm still looking for one.

If you happen to know the download link for VG or if image features for VG are not required separately, it would be immensely helpful if you could let me know. Once again, I want to express my gratitude for the excellent research work done with ptp.

Thank you!


FingerRec commented 1 year ago

Hi Jin.

For VG, pre-extracted image features are not provided.

To generate the VG features yourself, follow "https://github.com/FingerRec/OA-Transformer/blob/main/ObjectExtractor/multiprocess_full_cc3m_complementary_modify_tsv_gen_from_video.py" or the BUTD implementation (https://github.com/MILVLG/bottom-up-attention.pytorch/blob/master/extract_features.py).
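
As a rough sketch of the preparation step (not taken from either repository): before running an extractor over VG, it can help to enumerate the downloaded images and split them into one shard per worker process. The helper names `list_images` and `shard` are hypothetical, and the assumption is that the VG images sit as ordinary image files somewhere under a single root folder. The detector call itself is deliberately left out, since its interface depends on which of the two scripts you use.

```python
from pathlib import Path

# Assumption: VG images are stored as regular image files under one root folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(root):
    """Return all image files under root, sorted so shards are reproducible."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def shard(items, num_shards):
    """Split items round-robin into num_shards lists, one per worker."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [items[i::num_shards] for i in range(num_shards)]
```

Each shard can then be handed to a separate process that runs the chosen extraction script on its subset, for example `shard(list_images("/path/to/VG"), 4)` for four workers (the path is a placeholder).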

9115jin commented 1 year ago

Hi FingerRec,

Thank you for pointing me to the code for extracting VG image features. I didn't realize code was available for that, so I appreciate you sharing the links. Thanks a lot!