GaParmar / clean-fid

PyTorch - FID calculation with proper image resizing and quantization steps [CVPR 2022]
https://www.cs.cmu.edu/~clean-fid/
MIT License

Resize per channel reasons #48

Open iroilomamnou opened 1 year ago

iroilomamnou commented 1 year ago

Thanks for sharing this great work. I'm just wondering about the reasons behind resizing the image per channel rather than resizing it as a 3-channel array. I'm referring to this piece of code in the file resize.py.

import numpy as np
from PIL import Image

def resize_single_channel(x_np, output_size):
    # output_size is (width, height), the order PIL's resize expects
    s1, s2 = output_size
    img = Image.fromarray(x_np.astype(np.float32), mode='F')
    img = img.resize(output_size, resample=Image.Resampling.BICUBIC)
    return np.asarray(img).clip(0, 255).reshape(s2, s1, 1)

def resize_channels(x, new_size):
    # resize each of the 3 channels separately, then stack them back to HxWx3
    x = [resize_single_channel(x[:, :, idx], new_size) for idx in range(3)]
    x = np.concatenate(x, axis=2).astype(np.uint8)
    return x

I hope this is the right way to ask questions here, since it's my first time asking a question on Issues rather than reporting an issue.

Thanks for your understanding.

GaParmar commented 10 months ago

Hi, thanks for your interest in this work! We perform the resizing per channel so that we can use mode="F", which does not quantize the image to 8-bit integers and keeps the image values as 32-bit floats.