Closed · GuCheng123 closed this 4 months ago
Hi, we use ADM’s TensorFlow evaluation suite (link) for evaluation. After you have the generated images, convert them into a single .npz file (only ImageNet requires this manual step; the other datasets produce it automatically) and then run the evaluation. The conversion script is as follows:
import os

import numpy as np
from PIL import Image
from tqdm import tqdm


def create_npz_from_sample_folder(sample_dir, npz_path, num=50_000):
    """
    Builds a single .npz file from a folder of .png samples.
    """
    samples = []
    # Sort for a deterministic order and take at most `num` samples.
    files = sorted(os.listdir(sample_dir))[:num]
    for name in tqdm(files, desc="Building .npz file from samples"):
        sample_pil = Image.open(os.path.join(sample_dir, name))
        sample_np = np.asarray(sample_pil).astype(np.uint8)
        samples.append(sample_np)
    samples = np.stack(samples)  # (N, H, W, 3)
    np.savez(npz_path, arr_0=samples)
    print(f"Saved .npz file to {npz_path} [shape={samples.shape}].")
    return npz_path
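To make the conversion concrete, here is a small self-contained sketch of the same pipeline: it writes a few dummy 256×256 PNGs into a temporary folder (purely illustrative data, not real samples), packs them into an .npz under the key `arr_0` exactly as the script above does, and reads the batch back to verify the shape.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Hypothetical demo folder with a few dummy 256x256 RGB "samples".
sample_dir = tempfile.mkdtemp()
for i in range(4):
    Image.fromarray(
        np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
    ).save(os.path.join(sample_dir, f"{i:06d}.png"))

# Pack the folder into a single .npz, same layout as the script above.
samples = []
for name in sorted(os.listdir(sample_dir)):
    arr = np.asarray(Image.open(os.path.join(sample_dir, name))).astype(np.uint8)
    samples.append(arr)
samples = np.stack(samples)  # (N, H, W, 3)

npz_path = os.path.join(sample_dir, "samples.npz")
np.savez(npz_path, arr_0=samples)

# The evaluation suite reads the batch back under the key "arr_0".
loaded = np.load(npz_path)["arr_0"]
print(loaded.shape)  # (4, 256, 256, 3)
```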
My testing flow is as follows: for the class-conditional setting, we generate images for all 1000 classes with the class-conditional diffusion model, then randomly sample images from the same 1000 classes of ImageNet as the reference set, resize and center-crop both to 256×256, and finally measure the metrics with torch_fidelity. But the results differ substantially from your paper. Is my testing process wrong? How do you compute these metrics?
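One common source of mismatch in such comparisons is the reference-image preprocessing. As a sketch (reconstructed from memory of ADM/guided-diffusion's `center_crop_arr`, so treat the details as an assumption, not the authoritative implementation), a resize-then-center-crop to 256×256 looks roughly like this:

```python
import numpy as np
from PIL import Image


def center_crop_arr(pil_image, image_size):
    """Resize so the shorter side equals image_size, then center-crop.

    Sketch in the style of guided-diffusion's preprocessing (assumption).
    """
    # Repeatedly halve very large images with BOX filtering to limit aliasing.
    while min(*pil_image.size) >= 2 * image_size:
        pil_image = pil_image.resize(
            tuple(x // 2 for x in pil_image.size), resample=Image.BOX
        )
    # Scale so the shorter side matches image_size.
    scale = image_size / min(*pil_image.size)
    pil_image = pil_image.resize(
        tuple(round(x * scale) for x in pil_image.size), resample=Image.BICUBIC
    )
    # Crop the central image_size x image_size patch.
    arr = np.array(pil_image)
    crop_y = (arr.shape[0] - image_size) // 2
    crop_x = (arr.shape[1] - image_size) // 2
    return arr[crop_y : crop_y + image_size, crop_x : crop_x + image_size]


# e.g. a 300x500 image becomes a (256, 256, 3) array
out = center_crop_arr(Image.new("RGB", (300, 500)), 256)
print(out.shape)  # (256, 256, 3)
```

If your crop strategy, resampling filter, or reference set (e.g. 1000 random images rather than the evaluator's full reference batch) differs from what the paper used, FID-style metrics can shift noticeably.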