soumik12345 / Hemm

A holistic evaluation library for multi-modal generative models using 🤗 Diffusers and Weave
http://geekyrakshit.dev/Hemm/
Apache License 2.0

Implement metrics for evaluating image quality #8

Closed soumik12345 closed 4 months ago

soumik12345 commented 4 months ago

Hi @sayakpaul, can you please check the way I'm computing FID in this PR before I proceed to add the other metrics?

Here's a sample evaluation trace on a small subset of 10 datapoints from the COCO validation set: https://wandb.ai/geekyrakshit/t2i_eval/weave/calls/1d34168d-bf4e-4364-908b-f1b323152ce6?tracetree=1

soumik12345 commented 4 months ago

I am closing this PR, because I made one mistake: metrics like FID are calculated between distributions and not between individual images.
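
To illustrate the point, here is a minimal sketch of a Fréchet distance computed between two sets of feature vectors rather than between a pair of images. It is not the code from this PR or Hemm's implementation: `frechet_distance`, `feats_a` and `feats_b` are hypothetical names, and the features are assumed to be precomputed (for real FID they would typically be Inception-v3 activations). The trace of the matrix square root is taken from the eigenvalues of the covariance product, which is one of several ways to compute it.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input has shape (num_samples, feature_dim). Both names are
    hypothetical; the features are assumed to be extracted elsewhere.
    """
    a = np.asarray(feats_a, dtype=np.float64)
    b = np.asarray(feats_b, dtype=np.float64)
    if a.ndim != 2 or b.ndim != 2 or a.shape[1] != b.shape[1]:
        raise ValueError("expected two (num_samples, feature_dim) arrays")
    # A covariance cannot be estimated from a single sample, which is why
    # the metric is defined on sets of images and not on individual images.
    if a.shape[0] < 2 or b.shape[0] < 2:
        raise ValueError("need at least two samples in each set")

    mu_a, mu_b = a.mean(axis=0), b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(b, rowvar=False))

    # Tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of the product; clip tiny negative values from round-off.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)
```

Called with the features of all real images as one set and the features of all generated images as the other, this returns a single score for the whole evaluation run. With only a handful of samples the covariance estimates are very noisy, so a 10-datapoint subset is useful as a smoke test at best.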

sayakpaul commented 4 months ago

@soumik12345 LMK when you have fixed the issues you found and have opened a new PR.

sayakpaul commented 4 months ago

Also, I don't think we need to report all the metrics here; they are quite outdated TBH. As far as I know, no recent image generation paper has reported them. So, FID and CMMD are fine.

Also, don't land all the metrics in a single PR. Let's tackle them one by one, please.