hatchetProject / QuEST

QuEST: Efficient Finetuning for Low-bit Diffusion Models

How do you measure FID, sFID, IS and CLIP Score? #3

Closed GuCheng123 closed 4 months ago

GuCheng123 commented 4 months ago

My testing flow is as follows: for the class-conditional setting, we generate images for 1,000 classes with the class-conditional diffusion model, then randomly sample images for the same 1,000 classes from ImageNet and resize/center-crop them to 256×256. Finally, torch_fidelity is used to compute the metrics. But the scores differ greatly from those in your paper. Is my testing process wrong? How do you evaluate these metrics?

hatchetProject commented 4 months ago

Hi, we use the ADM’s TensorFlow evaluation suite (link) for evaluation. After you get the generated images, convert them into an .npz file (only ImageNet samples need this manual step; the others are produced automatically) and run the evaluation. The script for conversion is as follows:

import os

import numpy as np
from PIL import Image
from tqdm import tqdm


def create_npz_from_sample_folder(sample_dir, npz_path, num=50_000):
    """
    Builds a single .npz file from a folder of .png samples.
    """
    samples = []
    # Sort for a deterministic order and honor the `num` cap
    # (the parameter was previously unused).
    files = sorted(os.listdir(sample_dir))[:num]
    for fname in tqdm(files, desc="Building .npz file from samples"):
        # os.path.join avoids relying on a trailing slash in sample_dir.
        sample_pil = Image.open(os.path.join(sample_dir, fname))
        sample_np = np.asarray(sample_pil).astype(np.uint8)
        samples.append(sample_np)
    samples = np.stack(samples)
    np.savez(npz_path, arr_0=samples)
    print(f"Saved .npz file to {npz_path} [shape={samples.shape}].")
    return npz_path
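As a quick sanity check (not part of the original script), you can verify that the saved archive has the layout the script produces and the evaluation suite reads: a single uint8 array stored under the key `arr_0` with shape `(N, H, W, 3)`. The sample data below is synthetic, purely for illustration:

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for real generated samples:
# 8 random 256x256 RGB images as uint8.
samples = np.random.randint(0, 256, size=(8, 256, 256, 3), dtype=np.uint8)

npz_path = os.path.join(tempfile.mkdtemp(), "samples.npz")
np.savez(npz_path, arr_0=samples)

# Reload and confirm the expected key, shape, and dtype.
loaded = np.load(npz_path)["arr_0"]
assert loaded.shape == (8, 256, 256, 3)
assert loaded.dtype == np.uint8
```

Once both the reference batch and your sample batch are in this format, you can pass the two .npz files to the evaluation script in the linked ADM suite to obtain FID, sFID, IS, and the other reported metrics.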