luo3300612 / image-captioning-DLCT

Official pytorch implementation of paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
BSD 3-Clause "New" or "Revised" License

Can you upload resources to another cloud like Gdrive, Onedrive or Dropbox? #17

Closed khiemledev closed 3 years ago

khiemledev commented 3 years ago

In my country, the download speed from Baidu is very slow and I can't download the needed resources. Can you please upload them to GDrive, OneDrive, or Dropbox? Thank you!

luo3300612 commented 3 years ago

Sorry, since the whole file is ~70 GB, I am not able to upload it to GDrive/OneDrive. But you can follow the Data preparation steps. There are 5 keys in my hdf5 feature file. The first three keys can be obtained when extracting region features with extract_region_feature.py. The fourth key can be obtained when extracting grid features with the code in grid-feats-vqa. The last key can be obtained with align.ipynb.
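The five keys described above can be assembled into one hdf5 file roughly as follows. This is only a sketch: the key names assume the `<image_id>_features/boxes/size/grids/mask` pattern used by this repo, and all array shapes below are placeholder assumptions, not the actual extraction output.

```python
import numpy as np
import h5py

# Placeholder arrays standing in for the real extraction outputs.
# Shapes here are assumptions for illustration only.
image_id = 318556
features = np.zeros((50, 2048), dtype=np.float32)  # region features (extract_region_feature.py)
boxes = np.zeros((50, 4), dtype=np.float32)        # region boxes (extract_region_feature.py)
size = np.array([640.0, 480.0], dtype=np.float32)  # image size (extract_region_feature.py)
grids = np.zeros((49, 2048), dtype=np.float32)     # grid features (grid-feats-vqa)
mask = np.zeros((50, 49), dtype=np.float32)        # region-grid alignment (align.ipynb)

# Write the five datasets for one image into a single hdf5 file.
with h5py.File('coco_all_align.hdf5', 'w') as f:
    f[f'{image_id}_features'] = features
    f[f'{image_id}_boxes'] = boxes
    f[f'{image_id}_size'] = size
    f[f'{image_id}_grids'] = grids
    f[f'{image_id}_mask'] = mask

with h5py.File('coco_all_align.hdf5', 'r') as f:
    print(sorted(f.keys()))
```

Repeating the write loop over every image in your dataset produces the combined feature file.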

khiemledev commented 3 years ago

I did some tricks and successfully downloaded the files. Thank you for your reply!

I have another question. Can you please tell me how to produce coco_train_ids.npy, coco_test_ids.npy, and coco_restval_ids.npy files for my own dataset already in COCO format?

luo3300612 commented 3 years ago

coco_train_ids.npy is a (N,) numpy array, where N is the number of training samples. It contains the ids that specify the image-text pairs in captions_train2014.json:

```python
>>> import json
>>> info = json.load(open('captions_train2014.json'))
>>> annotations = info['annotations']
>>> print(annotations[0])
{'image_id': 318556, 'id': 48, 'caption': 'A very clean and well decorated empty bathroom'}
```

so the image features for this pair are stored under the keys 318556_features/boxes/size/grids/mask, and the corresponding caption is 'A very clean and well decorated empty bathroom'.

However, since the code is tightly coupled to the COCO dataset, it is recommended to re-write dataset.py for your own dataset.

Otherwise, you need to create the hdf5 file, the train/val/test_ids.npy files, and the captions_train2014.json/captions_val2014.json files for your own dataset.
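Building the *_ids.npy files from a COCO-format caption file can be sketched as follows. The split shown is a hypothetical one (everything goes to train); the tiny caption file is a stand-in written only so the snippet is self-contained.

```python
import json
import numpy as np

# Tiny stand-in for captions_train2014.json (hypothetical data, for illustration).
dummy = {'annotations': [
    {'image_id': 318556, 'id': 48, 'caption': 'A very clean and well decorated empty bathroom'},
    {'image_id': 116100, 'id': 67, 'caption': 'A panoramic view of a kitchen'},
]}
with open('captions_train2014.json', 'w') as f:
    json.dump(dummy, f)

# Collect the annotation ids -- these are the values stored in coco_train_ids.npy.
info = json.load(open('captions_train2014.json'))
ids = np.array([ann['id'] for ann in info['annotations']])
np.save('coco_train_ids.npy', ids)
print(ids)  # -> [48 67]
```

For val/test splits, partition the annotation ids the same way and save them as coco_test_ids.npy and coco_restval_ids.npy.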

Hope it helps.

khiemledev commented 3 years ago

It's very helpful. Thank you very much!

YuigaWada commented 1 year ago

For those who don't have a Baidu account, I created a mirror of the data distributed on Baidu Pan. You can download the data from this link without logging in. Use at your own risk :) (also related to issue #36)