soumik12345 / Hemm

A holistic evaluation library for multi-modal generative models using 🤗 Diffusers and Weave
http://geekyrakshit.dev/Hemm/
Apache License 2.0

Idea for computing FID for text-conditional Diffusion models #9

Open soumik12345 opened 3 weeks ago

soumik12345 commented 3 weeks ago

The idea is to use the ImageNet-1K dataset to compute FID for a text-conditional diffusion model. We could take 100 images from each ImageNet class as the real images and, using the diffusion model under evaluation, generate 100 images per class with a prompt like f"a photograph of {class}". These two sets of images would then be used to calculate an FID score for each ImageNet class. The final FID score could be the mean of the per-class FID scores.
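To make the proposal concrete, here is a minimal sketch of the per-class loop and the final averaging. It is not Hemm's API: `mean_per_class_fid`, `generate`, and `fid` are hypothetical names, and both the image generator and the FID computation are passed in as plain callables so that any diffusion pipeline or FID implementation could be plugged in. The placeholder is spelled `{class_name}` because `class` is a reserved word in Python.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

# Prompt template from the proposal above.
PROMPT_TEMPLATE = "a photograph of {class_name}"


def mean_per_class_fid(
    real_images: Dict[str, Sequence],
    generate: Callable[[List[str]], Sequence],
    fid: Callable[[Sequence, Sequence], float],
    images_per_class: int = 100,
) -> Tuple[float, Dict[str, float]]:
    """Return (mean FID over classes, per-class FID scores).

    real_images: maps a class name to its real images.
    generate:    hypothetical callable, turns a list of prompts into images.
    fid:         hypothetical callable, returns FID for (real, generated).
    """
    per_class: Dict[str, float] = {}
    for class_name, real in real_images.items():
        # One prompt per image to be generated for this class.
        prompts = [PROMPT_TEMPLATE.format(class_name=class_name)] * images_per_class
        generated = generate(prompts)
        # Compare the same number of real and generated images.
        per_class[class_name] = fid(real[:images_per_class], generated)
    # Final score: unweighted mean of the per-class FID scores.
    return mean(list(per_class.values())), per_class
```

In practice, `generate` would wrap the diffusion model being evaluated and `fid` would wrap an existing FID implementation. Returning the per-class scores alongside the mean keeps it possible to inspect which classes the model handles poorly.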

soumik12345 commented 3 weeks ago

@sayakpaul, I would love to hear your opinion on this.