Closed JoakimHaurum closed 1 year ago
Here's the class name dict I used: https://gist.github.com/dbolya/0befcc8b4147d8c0dce37d14120e6779 The only modification I made (and this is in the gist) is to distinguish "crane", the bird, from "crane", the construction equipment. The prompt was:
f"A high quality photograph of a {_cls}."
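For anyone reproducing this, here is a minimal sketch of how the prompts could be built from a class name dict with that template. The dict below is a tiny placeholder, not the gist's contents: the indices and the two "crane" spellings are made up for illustration, and `build_prompts` is a hypothetical helper name.

```python
# Placeholder excerpt of a class-index -> class-name mapping.
# The real names (including how the two cranes are told apart) are in the gist.
class_names = {
    0: "tench",
    1: "goldfish",
    2: "crane bird",          # assumed spelling, for illustration only
    3: "construction crane",  # assumed spelling, for illustration only
}


def build_prompts(names):
    """Return one text prompt per class index using the template from the thread."""
    return {idx: f"A high quality photograph of a {_cls}." for idx, _cls in names.items()}


prompts = build_prompts(class_names)
for idx, prompt in prompts.items():
    print(idx, prompt)
```

Swapping the placeholder dict for the one in the gist should give one prompt per ImageNet class.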
As for the ImageNet val images, I can't share those myself (copyright and stuff), but it's easy to sample them. I just took the first 5 val set images for each class, e.g., image IDs 000000, 000001, 000002, 000003, 000004, 000050, 000051, and so on. So you can just download the val set and do the same.
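A rough sketch of that sampling, under one loud assumption: the IDs are positions in a val set listing that is ordered by class, with 50 images per class (50,000 val images over 1,000 classes). That ordering is inferred from the ID pattern above, not confirmed, and `first_k_per_class` is a hypothetical helper name.

```python
def first_k_per_class(num_classes=1000, per_class=50, k=5):
    """Return positional IDs of the first k images of every class.

    Assumes the val images are listed class by class, per_class images each.
    """
    return [c * per_class + i for c in range(num_classes) for i in range(k)]


ids = first_k_per_class()
print(len(ids))                        # 5000
print([f"{i:06d}" for i in ids[:7]])   # zero-padded like the IDs in the thread
```

If your copy of the val set is not ordered by class, you would first need to sort the files by their ground-truth label before taking the first 5 of each.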
Thank you very much, exactly what I needed :)
Is it possible for you to share the exact class names used as prompts, as well as a list of the 5K files from the ImageNet val set that were used for the FID computation? I have tried looking into the lists of ImageNet class names available online, but they differ in slight ways.
Best, Joakim