VICO-UoE / DatasetCondensation

Dataset Condensation (ICLR21 and ICML21)
MIT License

different Figures from README #9

Closed: EchizenG closed this issue 2 years ago

EchizenG commented 2 years ago

Hi, I tried to recover the synthetic images from the dataset you offered on Google Drive. These are what I got: (images: syn0-0-0-0, syn0-1-0-0, syn4-8-0-8). Dataset name: DC_MNIST_ConvNet_1ipc.pt. I can identify them as 0, 1, and 8. However, the background is white or grey. Did I do anything wrong?

Thank you!

PatrickZH commented 2 years ago

Hi, please try this code:

import torch
from torchvision.utils import save_image

file_path = './res_DC_MNIST_ConvNet_1ipc.pt'
data = torch.load(file_path)
for exp in range(5):
    img = data['data'][exp][0]  # synthetic image tensor for experiment `exp`
    # Try normalize=True and normalize=False to see which gives the better visual effect.
    save_image(img, file_path[:-3] + '_%d.png' % exp, nrow=1)

EchizenG commented 2 years ago

This is so cool! Thank you so much! I have a further question. The real MNIST images were normalized with (mean=0.1307, std=0.3081), so we need to denormalize them first. I thought I would also need to denormalize the synthetic data first and then multiply by 255, but denormalizing seems redundant. Do you know why? Thank you again!

PatrickZH commented 2 years ago

You are welcome! The synthetic images are already normalized. When training neural networks, you can use them directly without any normalization or denormalization. To visualize them, you may need to map the pixel values into a displayable range for a better visual effect.
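
One common way to map already-normalized pixels into a displayable range is min-max rescaling, which is roughly what `normalize=True` does in torchvision's `save_image`. The sketch below uses NumPy; the function name `min_max_rescale` and the example values are assumptions made for illustration, not code from the repository.

```python
import numpy as np

def min_max_rescale(image):
    """Linearly map the smallest value to 0 and the largest to 1."""
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:
        # A constant image has no contrast to stretch; return all zeros.
        return np.zeros_like(image, dtype=np.float64)
    return (image - lo) / (hi - lo)

# Hypothetical normalized pixel values (not taken from the real .pt file).
example = np.array([[-0.42, 1.2, 2.82]])
rescaled = min_max_rescale(example)
```

Because the mapping depends on the minimum and maximum of each tensor, the same pixel value can come out at a different brightness in different images, which may explain why backgrounds sometimes look grey.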

EchizenG commented 2 years ago

Normalization has two steps: 1) 0-255 to 0-1; 2) 0-1 to Gaussian(0,1). Theoretically, denormalization should be: 1) Gaussian(0,1) to 0-1; 2) 0-1 to 0-255. However, MNIST doesn't seem to need step 1) Gaussian(0,1) to 0-1, while CIFAR does.
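
The two steps and their inverses can be sketched as a round trip. This is a minimal illustration using NumPy and the MNIST statistics quoted earlier in this thread; the helper names `normalize` and `denormalize` are hypothetical and do not come from the repository.

```python
import numpy as np

MEAN, STD = 0.1307, 0.3081  # MNIST statistics quoted in this thread

def normalize(pixels_uint8):
    """Step 1: 0-255 -> 0-1. Step 2: subtract the mean and divide by the std."""
    scaled = pixels_uint8.astype(np.float32) / 255.0
    return (scaled - MEAN) / STD

def denormalize(standardized):
    """Inverse: undo the mean/std step, clip to 0-1, then scale back to 0-255."""
    scaled = standardized * STD + MEAN
    scaled = np.clip(scaled, 0.0, 1.0)  # synthetic pixels may fall outside 0-1
    return np.rint(scaled * 255.0).astype(np.uint8)

original = np.array([[0, 128, 255]], dtype=np.uint8)
restored = denormalize(normalize(original))
```

For real images the round trip recovers the original pixel values. Condensed images are optimized freely, so the clipping step matters more for them than for real data.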

PatrickZH commented 2 years ago

Your denormalization process for visualization is correct. But that doesn't mean the MNIST results don't need "1) Gaussian(0,1) to 0-1". Maybe they just look "better" if you skip it, as MNIST pixels have a quite different distribution from natural images.
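
A small arithmetic sketch of why skipping that step can still look fine once values are clipped to 0-1. It assumes the synthetic pixels behave like real MNIST pixels, which are mostly pure background (0) or pure stroke (1); the function `to_display` and the three sample pixels are hypothetical.

```python
MEAN, STD = 0.1307, 0.3081  # MNIST statistics quoted in this thread

def to_display(value, denormalize):
    """Map one normalized pixel to 0-255, with or without the mean/std step."""
    if denormalize:
        value = value * STD + MEAN
    value = min(max(value, 0.0), 1.0)  # clip to the displayable range
    return round(value * 255)

# Normalized values of three original intensities (0 = black, 1 = white).
background = (0.0 - MEAN) / STD   # about -0.42
stroke = (1.0 - MEAN) / STD       # about 2.82
dark_gray = (0.25 - MEAN) / STD   # about 0.39

for name, v in [("background", background), ("stroke", stroke), ("dark_gray", dark_gray)]:
    print(name, to_display(v, denormalize=True), to_display(v, denormalize=False))
```

Background and stroke pixels land on 0 and 255 either way, because clipping absorbs the difference. Only intermediate values shift, and they come out brighter when denormalization is skipped. CIFAR images have many intermediate values, so the difference is much more visible there.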

EchizenG commented 2 years ago

What confuses me now is whether the condensed MNIST dataset really needs "Gaussian(0,1) to 0-1". If it does, the denormalized images look like the ones I showed in my question: black digits on a white background. If it doesn't, the results look good and are really similar to the original dataset.

PatrickZH commented 2 years ago

Hi, please try the following code. Whether you denormalize ("Gaussian(0,1) to 0-1") or not, the images look good.

import torch
from PIL import Image

file_path = './res_DC_MNIST_ConvNet_1ipc.pt'
data = torch.load(file_path)
mean, std = 0.1307, 0.3081  # MNIST normalization statistics
for exp in range(5):
    img = data['data'][exp][0].detach().cpu().clone()  # already normalized
    img = img * std + mean  # denormalize; comment this line out and see what happens
    img = img.clamp(0, 1) * 255  # clip to [0, 1], then scale to 0-255
    # Convert the first image's single channel to uint8 so PIL treats it as grayscale.
    I = Image.fromarray(img[0, 0].numpy().astype('uint8'))
    I.show()

EchizenG commented 2 years ago

Thank you so much for offering this code. I'll try and check it!