yuvalkirstain / PickScore

MS-COCO #6

Closed h4nwei closed 1 year ago

h4nwei commented 1 year ago

Hello @yuvalkirstain,

Thanks for your outstanding work. I am writing to inquire if it is possible to obtain the labeled MS-COCO validation set, as it would be valuable for conducting comparisons with FID and PickScore simultaneously. I would greatly appreciate any assistance you can offer.

Best regards, Hanwei

yuvalkirstain commented 1 year ago

Sure thing!

Download the data:

mkdir data
cd data
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip val2017.zip
unzip annotations_trainval2017.zip
cd ..

Process the data:

import json
from collections import defaultdict

from datasets import Dataset
from PIL import Image
from tqdm import tqdm

# Paths assume the script is run from the directory that contains data/.
print("Loading captions")
with open("data/annotations/captions_val2017.json") as f:
    data = json.load(f)

# Collect the image metadata column by column and add the decoded image.
dataset = defaultdict(list)
print("Loading images")
for dp in tqdm(data["images"]):
    for k, v in dp.items():
        dataset[k].append(v)
    image = Image.open(f"data/val2017/{dp['file_name']}").convert("RGB")
    dataset["image"].append(image)

print("Creating dataset object")
ds = Dataset.from_dict(dataset)

# Later annotations overwrite earlier ones, so one caption is kept per image id.
print("Mapping captions")
id2caption = {ann["image_id"]: ann["caption"] for ann in data["annotations"]}
ds = ds.map(lambda e: {"caption": id2caption[e["id"]]})

print("Saving dataset")
ds.save_to_disk("cocoval2017")
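
Note that COCO provides several captions per image, while the id2caption dictionary above keeps only one per image id. If all captions are needed, a minimal sketch of grouping them could look like the following. The helper name group_captions and the toy annotations are hypothetical and only mirror the shape of data["annotations"]:

```python
from collections import defaultdict


def group_captions(annotations):
    """Collect every caption for each image id (hypothetical helper)."""
    captions = defaultdict(list)
    for ann in annotations:
        captions[ann["image_id"]].append(ann["caption"])
    return dict(captions)


# Toy entries shaped like data["annotations"]; these are not real COCO captions.
toy_annotations = [
    {"image_id": 1, "caption": "a cat on a sofa"},
    {"image_id": 1, "caption": "a sleeping cat"},
    {"image_id": 2, "caption": "a red bus"},
]

grouped = group_captions(toy_annotations)
print(grouped)
```

With the real annotations, the grouped dictionary could presumably be passed to ds.map in the same way as id2caption, storing a list of captions per image instead of a single string.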

If this answers your question please close the issue :)

h4nwei commented 1 year ago

Thank you for your prompt reply. Beyond the raw COCO data, I would like to know whether the images generated by the 9 different models and the corresponding human preferences are available. With them, I could compute the correlation between FID and human experts, as in Fig. 6 of the manuscript. Thank you.
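
For reference, one way to compute such a correlation is a Spearman rank correlation between a metric's per-model scores and per-model human preference rates. The sketch below is only an illustration: this thread does not state which correlation measure the manuscript uses, the function names are hypothetical, and the numbers are made-up placeholders rather than results from the paper.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Placeholder values for four hypothetical models (NOT from the paper).
fid = [30.1, 25.4, 20.2, 15.8]       # lower is better
human_pref = [0.2, 0.4, 0.3, 0.8]    # higher is better

# Negate FID so that "higher is better" holds for both lists.
rho = spearman([-f for f in fid], human_pref)
print(rho)
```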

yuvalkirstain commented 1 year ago

Oh I see. We haven't yet uploaded the generated images and corresponding human preferences. If you wish to evaluate your scoring function, I strongly urge you to focus on the second experiment in which we calculate Elo ratings from the different scoring functions.
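
As a rough illustration of that second experiment, the sketch below derives Elo ratings from pairwise comparisons decided by a scoring function. It uses the standard Elo update. The K-factor of 32, the initial rating of 1500, the function names, and the toy scores are all assumptions made for this example and are not taken from the paper or the repository.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))


def outcome_from_scores(score_a, score_b):
    """Turn two scores from a scoring function into a match outcome for A."""
    if score_a > score_b:
        return 1.0
    if score_a < score_b:
        return 0.0
    return 0.5  # tie


def elo_ratings(matches, k=32.0, initial=1500.0):
    """Sequentially update ratings from (model_a, model_b, outcome_a) tuples."""
    ratings = {}
    for model_a, model_b, outcome_a in matches:
        r_a = ratings.setdefault(model_a, initial)
        r_b = ratings.setdefault(model_b, initial)
        e_a = expected_score(r_a, r_b)
        ratings[model_a] = r_a + k * (outcome_a - e_a)
        ratings[model_b] = r_b + k * ((1.0 - outcome_a) - (1.0 - e_a))
    return ratings


# Toy comparisons: each tuple holds two hypothetical model names and the
# outcome implied by made-up scores for images generated from the same prompt.
matches = [
    ("model_x", "model_y", outcome_from_scores(0.7, 0.4)),
    ("model_y", "model_z", outcome_from_scores(0.5, 0.5)),
    ("model_x", "model_z", outcome_from_scores(0.3, 0.6)),
]
ratings = elo_ratings(matches)
print(sorted(ratings.items(), key=lambda kv: kv[1], reverse=True))
```

Because the updates are sequential, the final ratings depend on the order of the matches. Shuffling the matches and averaging over several runs is one common way to reduce that effect.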

h4nwei commented 1 year ago

OK, I see. I appreciate your suggestion.