the nuimage 256x256 recurrence

KaiChen1998 / GeoDiffusion

Official PyTorch implementation of GeoDiffusion in ICLR 2024 (https://arxiv.org/abs/2306.04607)

MIT License


As stated in our README, we recommend following the exact settings of the LAMA repo step by step for precise evaluation on the COCO-Stuff Layout-to-Image benchmark, since FID calculation is extremely sensitive to implementation details.

So far, I have noticed the following differences:

Following LAMA, we filter the original COCO validation set, which leaves a validation set of 3,097 images. We therefore generate 3,097 x 5 = 15,485 images and then calculate FID against the 3,097 real images. We have provided the filtered validation image list on HuggingFace. See Section 5.1 of the LAMA paper for more details.
LAMA uses a TensorFlow-based FID implementation instead of pytorch_fid.

Frankly speaking, there is not much difference between 15.90 FID and 14.58 FID.
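
To make the counting in the first point concrete, here is a minimal sketch of the bookkeeping, not the repo's actual evaluation code. It assumes the filtered validation list is available as a plain list of file names; `build_generation_plan`, `output_name`, `check_counts`, and the output naming scheme are all hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# From the comment above: 5 generated images per filtered validation image.
SAMPLES_PER_LAYOUT = 5


def build_generation_plan(image_ids, samples_per_layout=SAMPLES_PER_LAYOUT):
    """Return (image_id, sample_index) pairs, one per image to generate."""
    # Drop accidental duplicates while keeping the original order.
    unique_ids = list(dict.fromkeys(image_ids))
    return [(image_id, k) for image_id in unique_ids for k in range(samples_per_layout)]


def output_name(image_id, sample_index):
    """Hypothetical naming scheme for a generated image (an assumption)."""
    return f"{Path(image_id).stem}_{sample_index}.png"


def check_counts(n_real, n_generated, samples_per_layout=SAMPLES_PER_LAYOUT):
    """Raise if the generated set is not samples_per_layout times the real set."""
    expected = n_real * samples_per_layout
    if n_generated != expected:
        raise ValueError(f"expected {expected} generated images, got {n_generated}")
    return expected


if __name__ == "__main__":
    # Placeholder IDs standing in for the 3,097 filtered validation images.
    filtered_ids = [f"{i:012d}.jpg" for i in range(3097)]
    plan = build_generation_plan(filtered_ids)
    print(check_counts(len(filtered_ids), len(plan)))
```

Running a check like this before computing FID helps rule out a mismatched sample count as the source of a gap. It does not address the second point: the choice of FID implementation still has to match.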


the nuimage 256x256 recurrence #16