Zeju1997 / oft

Official implementation of "Controlling Text-to-Image Diffusion by Orthogonal Finetuning".
https://oft.wyliu.com/
MIT License
280 stars, 14 forks

How to get COCO dataset? #19

Closed. DaShenZi721 closed this issue 7 months ago

DaShenZi721 commented 7 months ago

How do I download and arrange the COCO dataset so that I get the following directory structure?

└── COCO
│   ├── train
│   │   ├── color
│   │   ├── depth
│   ...

Zeju1997 commented 7 months ago

For the COCO dataset, we do not host the data for download, but you can easily download it from the official website: https://cocodataset.org/#download (train/val 2017; we use the train split for training and the val split for evaluation).

We used BLIP to generate the captions so that they are consistent across the different datasets. Here are the captions for reference: coco_prompt_train_blip.json, coco_prompt_val_blip.json
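As a rough illustration of the download step described above, here is a minimal sketch that unpacks an official COCO 2017 archive into a `color` folder. The two URLs are the standard archive links from the COCO download page; everything else is an assumption: the helper name `extract_split` is hypothetical, and so is the idea that the raw images belong in `<COCO root>/<split>/color` and that the evaluation split is named `val`. Check the repository's data-loading code before relying on this layout.

```python
import zipfile
from pathlib import Path

# Official COCO 2017 image archives (linked from https://cocodataset.org/#download).
COCO_ZIPS = {
    "train": "http://images.cocodataset.org/zips/train2017.zip",
    "val": "http://images.cocodataset.org/zips/val2017.zip",
}


def extract_split(zip_path, coco_root, split):
    """Copy every .jpg in a downloaded COCO archive into <coco_root>/<split>/color.

    ASSUMPTION: the `color` folder holds the unmodified COCO images. This
    mapping is a guess based on the directory tree in the question, not
    something the thread confirms.
    """
    target = Path(coco_root) / split / "color"
    target.mkdir(parents=True, exist_ok=True)
    count = 0
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            name = Path(member.filename).name
            # Skip directory entries and anything that is not a JPEG image.
            if member.is_dir() or not name.lower().endswith(".jpg"):
                continue
            # Drop the archive's top-level folder (e.g. "val2017/") and keep
            # the original file name so it can be matched against captions.
            (target / name).write_bytes(archive.read(member))
            count += 1
    return count
```

After downloading `val2017.zip` from `COCO_ZIPS["val"]`, a call such as `extract_split("val2017.zip", "COCO", "val")` returns the number of images written. The original file names are kept on purpose, since they are the only stable key for pairing images with captions.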

csguoh commented 4 months ago

Hi, sorry to bother you.

I have downloaded the COCO 2017 val set ("2017 Val images [5K/1GB]") from the official website. However, I find that the file names are not consistent with the prompt JSON file you provided.

The image list of the unzipped COCO 2017 val set is as follows:

[image]

The prompt JSON generated with BLIP is as follows:

[image]

It seems the prompts do not match the images. Did I misunderstand something?

DaShenZi721 commented 4 months ago

Because the BLIP prompt JSON files are in a different order from the original COCO dataset, you had better regenerate the prompts yourself by following the COCO processing procedure in the appendix.
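For anyone regenerating the prompts, the sketch below shows one way to caption images with BLIP while storing each caption next to its file name, so the result does not depend on ordering. It covers only the captioning step, not the rest of the processing procedure in the appendix. Several things here are assumptions: the checkpoint `Salesforce/blip-image-captioning-base` (the thread does not say which BLIP model the authors used), the `{"image": ..., "prompt": ...}` record layout, and the helper names `make_blip_captioner` and `regenerate_prompts`, which are hypothetical. It uses the Hugging Face `transformers` BLIP classes and Pillow, which need to be installed separately.

```python
import json
from pathlib import Path


def make_blip_captioner(model_name="Salesforce/blip-image-captioning-base"):
    """Return a function that maps an image path to a BLIP caption.

    ASSUMPTION: the checkpoint name is a common public BLIP model, not
    necessarily the one used to produce the reference caption files.
    """
    # Imported lazily so the rest of the sketch works without these packages.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForConditionalGeneration.from_pretrained(model_name)

    def caption(image_path):
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def regenerate_prompts(image_dir, out_json, caption_fn):
    """Caption every .jpg in image_dir and write the records to out_json.

    ASSUMPTION: the record layout below is hypothetical; adapt the keys to
    whatever the training code expects.
    """
    records = []
    # Sort by file name so repeated runs produce the same order.
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        records.append({"image": image_path.name, "prompt": caption_fn(image_path)})
    Path(out_json).write_text(json.dumps(records, indent=2))
    return records
```

A call might look like `regenerate_prompts("COCO/val/color", "coco_prompt_val_blip.json", make_blip_captioner())`, where the input path is again an assumed layout. Passing the captioner in as an argument keeps the file handling testable without downloading the model.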

csguoh commented 4 months ago

Got it! Thanks for your reply.