crowsonkb / k-diffusion

Karras et al. (2022) diffusion models for PyTorch
MIT License

About the data normalization and 'sigma_data'. #10

Closed (jwliu-cc closed this 2 years ago)

jwliu-cc commented 2 years ago

Hello, I have noticed that there is no normalization for images.

transforms

Is this necessary for training with 'sigma_data=0.5'? Images are usually normalized with 'std=0.5'.

crowsonkb commented 2 years ago

The KarrasAugmentationPipeline scales images to the range -1 to 1. (It returns a tuple of three items, so you can't put a Normalize after it; I just put the scaling into the pipeline instead.) This leaves the std for most image datasets around 0.5 (it was around 0.25 before, in the range 0 to 1). Very high-contrast datasets, like MNIST or vector art, might have a std higher than 0.5 after scaling to the range -1 to 1, so you should measure it and adjust sigma_data accordingly.
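To make the "measure it" step concrete, here is a minimal sketch in plain Python. It is not k-diffusion code: the helper names `to_minus1_1` and `measure_sigma_data` are made up for illustration, and the random toy "images" are only a stand-in for a real dataset. It shows that mapping pixels from [0, 1] to [-1, 1] (the same affine map as `Normalize(mean=0.5, std=0.5)`, i.e. `(x - 0.5) / 0.5`) doubles the std, and that high-contrast data ends up with a larger std.

```python
import random
import statistics


def to_minus1_1(pixels):
    """Map pixel values from [0, 1] to [-1, 1], i.e. x -> 2x - 1."""
    return [2.0 * p - 1.0 for p in pixels]


def measure_sigma_data(images):
    """Population std over every pixel of every image (hypothetical helper)."""
    flat = [p for img in images for p in img]
    return statistics.pstdev(flat)


# Toy stand-in for a dataset: 100 "images" of 64 pixels each, values in [0, 1].
rng = random.Random(0)
images_01 = [[rng.random() for _ in range(64)] for _ in range(100)]
images_pm1 = [to_minus1_1(img) for img in images_01]

std_01 = measure_sigma_data(images_01)
std_pm1 = measure_sigma_data(images_pm1)  # twice std_01, up to rounding

# High-contrast toy data: every pixel is exactly 0 or 1.
binary_01 = [[float(rng.random() < 0.5) for _ in range(64)] for _ in range(100)]
std_binary = measure_sigma_data([to_minus1_1(img) for img in binary_01])

print(f"std in [0, 1]:           {std_01:.3f}")
print(f"std in [-1, 1]:          {std_pm1:.3f}")
print(f"std, binary in [-1, 1]:  {std_binary:.3f}")
```

On a real dataset you would run the same measurement over the actual training images (with tensors rather than Python lists) and use the result as sigma_data if it differs noticeably from 0.5.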

jwliu-cc commented 2 years ago

Thanks for your reply and advice! I have tried training with normalization (using std=0.5; the std of the images after normalization is about 0.6~0.7) and with sigma_data set to 0.5, and it also goes well. Using the measured std of the dataset may be more beneficial to training; I will try this later.