Question about normalization

autonomousvision / stylegan-xl

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets

MIT License

Question about normalization #41

Closed zqh0253 closed 2 years ago

zqh0253 commented 2 years ago

Hi, I noticed that the normalization of input images differs from the conventional form, x = (x - mean) / std:

import torch


def norm_with_stats(x, stats):
    # Per channel: scale by 0.5 / mean, then shift by (0.5 - std) / mean.
    x_ch0 = torch.unsqueeze(x[:, 0], 1) * (0.5 / stats['mean'][0]) + (0.5 - stats['std'][0]) / stats['mean'][0]
    x_ch1 = torch.unsqueeze(x[:, 1], 1) * (0.5 / stats['mean'][1]) + (0.5 - stats['std'][1]) / stats['mean'][1]
    x_ch2 = torch.unsqueeze(x[:, 2], 1) * (0.5 / stats['mean'][2]) + (0.5 - stats['std'][2]) / stats['mean'][2]
    x = torch.cat((x_ch0, x_ch1, x_ch2), 1)  # re-stack the three channels along dim 1
    return x

Is there a paper or previous work that introduces this kind of normalization?

woctezuma commented 2 years ago

It is a bit weird as the normalization amounts to:

x_offset = (x+1)/2
x_norm = (x_offset - std) / mean

I would have expected, as you wrote:

x_norm = (x - mean) / std
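
The equivalence above can be checked numerically. The snippet below is a minimal sketch in plain Python, not code from the repository: the function names are made up for illustration, and the per-channel statistics are the commonly used ImageNet values, which is an assumption since the thread does not say which stats the backbone actually uses.

```python
# Hypothetical per-channel statistics (the usual ImageNet values), used only
# for illustration; the thread does not say which stats the backbone uses.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def original_affine(x, c):
    """Scale-and-shift form of the quoted norm_with_stats(), for channel c."""
    return x * (0.5 / MEAN[c]) + (0.5 - STD[c]) / MEAN[c]


def two_step_swapped(x, c):
    """Offset x by (x + 1) / 2, then subtract std and divide by mean."""
    x_offset = (x + 1) / 2
    return (x_offset - STD[c]) / MEAN[c]


def max_abs_difference(samples):
    """Largest gap between the two forms over all samples and channels."""
    return max(
        abs(original_affine(x, c) - two_step_swapped(x, c))
        for x in samples
        for c in range(3)
    )


# The two forms agree up to floating-point rounding.
print(max_abs_difference([-1.0, -0.5, 0.0, 0.25, 1.0]) < 1e-12)
```

Expanding the second form gives 0.5 * x / mean + (0.5 - std) / mean, which is term for term the expression in norm_with_stats().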

Code for norm_with_stats():

https://github.com/autonomousvision/stylegan_xl/blob/aa6531372d3517cfe3157631093191e8cfea2aaf/pg_modules/projector.py#L9-L14

https://github.com/autonomousvision/stylegan_xl/blob/aa6531372d3517cfe3157631093191e8cfea2aaf/pg_modules/projector.py#L117-L121

https://github.com/autonomousvision/stylegan_xl/blob/ec0f21a6ec5b16d8a16f8e23147498defa303cdc/training/loss.py#L108-L110

Code for get_backbone_normstats():

https://github.com/autonomousvision/stylegan_xl/blob/ec0f21a6ec5b16d8a16f8e23147498defa303cdc/training/loss.py#L60

https://github.com/autonomousvision/stylegan_xl/blob/aa6531372d3517cfe3157631093191e8cfea2aaf/pg_modules/projector.py#L16-L33

xl-sr commented 2 years ago

It's unnormalization + normalization in a single op. Currently it is somewhat cryptic, though, and there is a small bug. I pushed a fix :)

woctezuma commented 2 years ago

I see the commit.

9a6de7ed70e7980c04298da3916a3f9492322fba restructure normalization

So "un-normalization" is:

x_offset = (x+1)/2

https://github.com/autonomousvision/stylegan_xl/blob/9a6de7ed70e7980c04298da3916a3f9492322fba/pg_modules/projector.py#L115

And "normalization" is Normalize:

x_norm = (x_offset - mean) / std

https://github.com/autonomousvision/stylegan_xl/blob/9a6de7ed70e7980c04298da3916a3f9492322fba/pg_modules/projector.py#L3

https://github.com/autonomousvision/stylegan_xl/blob/9a6de7ed70e7980c04298da3916a3f9492322fba/pg_modules/projector.py#L105

So the small bug was that mean and std were swapped, right? The old code computed:

x_norm = (x_offset - std) / mean

xl-sr commented 2 years ago

correct :)
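
To summarize the thread, here is a minimal sketch in plain Python of the two-step version and of what a corrected single-op version could look like. It is not the code from the commit: the function names are hypothetical, inputs are assumed to lie in [-1, 1], and the statistics are the usual ImageNet values, chosen only for illustration.

```python
# Hypothetical per-channel statistics (the usual ImageNet values); an
# assumption for illustration, not taken from the repository.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def unnormalize(x):
    """Offset step from the thread: x_offset = (x + 1) / 2."""
    return (x + 1) / 2


def normalize(x_offset, c):
    """Conventional normalization for channel c: (x_offset - mean) / std."""
    return (x_offset - MEAN[c]) / STD[c]


def fused_corrected(x, c):
    """Both steps as one scale-and-shift, with mean and std in the right roles."""
    return x * (0.5 / STD[c]) + (0.5 - MEAN[c]) / STD[c]


def fused_swapped(x, c):
    """The earlier behavior, with mean and std swapped."""
    return x * (0.5 / MEAN[c]) + (0.5 - STD[c]) / MEAN[c]


x = 0.0  # a mid-range input value
two_step = normalize(unnormalize(x), 0)
print(abs(two_step - fused_corrected(x, 0)) < 1e-12)  # single op matches the two steps
print(abs(two_step - fused_swapped(x, 0)) > 0.1)      # the swapped form does not
```

Keeping the two steps separate, as the commit does, makes the intent easier to read; the fused form saves one elementwise pass but hides which statistic plays which role.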