rosinality / alias-free-gan-pytorch

Unofficial implementation of Alias-Free Generative Adversarial Networks (https://arxiv.org/abs/2106.12423) in PyTorch

filter_parameters not being set? #2

Open jjparkcv opened 3 years ago

jjparkcv commented 3 years ago

Hi, I'm trying to find out which cutoff frequencies you actually used for your experiments.

So I tried to find where the function 'filter_parameters' is being used, but I can't seem to find it.

Could you point me to where it is being set?

Thanks!

rosinality commented 3 years ago

The filter_parameters I have used for 256px are:

sr_n = 256
cutoff_n = sr_n / 2

filter_parameters(
    n_layer=14,
    n_critical=2,
    sr_max=sr_n,
    cutoff=(2, cutoff_n),
    stopband=(2 ** 2.1, cutoff_n * 2 ** 0.3),
    channel_max=512,
    channel_base=2 ** 14
)

which is the same as the parameters in the paper.
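
If it helps, this is roughly how I understand those arguments expanding into the per-layer filter schedule, following Appendix F.1 of the paper. Treat it as a minimal sketch of the geometric interpolation, not the exact function in this repository (the channel arguments only set per-layer widths and are left out here):

import math

def filter_parameters_sketch(n_layer, n_critical, sr_max, cutoff, stopband):
    # Geometric interpolation between the first and last layer (paper, F.1).
    # The exponent is clamped to 1 so the last n_critical layers stay at the
    # critical (final) cutoff and stopband.
    cutoffs, stopbands, sampling_rates = [], [], []
    for i in range(n_layer):
        t = min(i / (n_layer - n_critical), 1.0)
        f_c = cutoff[0] * (cutoff[1] / cutoff[0]) ** t
        f_t = stopband[0] * (stopband[1] / stopband[0]) ** t
        # sampling rate: smallest power of two covering twice the stopband
        s = 2 ** math.ceil(math.log2(min(f_t * 2, sr_max)))
        cutoffs.append(f_c)
        stopbands.append(f_t)
        sampling_rates.append(s)
    return {"cutoffs": cutoffs, "stopbands": stopbands, "sampling_rates": sampling_rates}

For the 256px arguments above this gives cutoffs 2, 2√2, 4, 4√2, ..., 128, 128, i.e. doubling every two layers up to the Nyquist limit of 128.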

jjparkcv commented 3 years ago

Thanks for the reply @rosinality

Using your parameters, the 'filter_parameters' function outputs the cutoff frequencies as: p['cutoffs'] => [2.0, 2.82842712474619, 4.0, 5.656854249492381, 7.999999999999999, 11.313708498984761, 16.0, 22.627416997969526, 31.999999999999996, 45.254833995939045, 64.00000000000001, 90.50966799187808, 128.0, 128.0]

However, according to the paper, the cutoff frequency is strictly a function of the resolution 's': f_c = s/2 − f_h, where f_h = (√2 − 1)(s/2). But I see that the first two elements of your cutoff list are different, even though their image resolutions 's' are the same.

Could you explain why your cutoff frequency changes at each layer? Perhaps I'm missing something from the paper?

Thank you in advance.

rosinality commented 3 years ago

I adopted the flexible layer specifications (config T). You can find the formula in Appendix F.1, Flexible layer specifications.
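
Concretely, the flexible specification interpolates the cutoff geometrically between the first and last layer: f_c[i] = f_c,0 · (f_c,N / f_c,0)^min(i / (N − n_critical), 1). With f_c,0 = 2, f_c,N = 128, N = 14 and n_critical = 2, the per-layer ratio is 64^(1/12) = √2, which is why the list you got doubles every two layers and stays at 128 for the last two (critically sampled) layers.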

jjparkcv commented 3 years ago

Hi, thank you for your clarifications.

On a separate note, could you explain a bit about the padding for upsampling and downsampling? The paper says it avoids zero-padding because it would give the CNN positional awareness. However, I notice that your 'upsample' function passes padding to 'upfirdn2d'.

Is zero-padding being used in the upsample method, or some other kind of padding? Also, if padding is being used, what is the point of the 10-pixel margin?

Thank you again!

jjparkcv commented 3 years ago

Ah, I think I get it. So the zero-padding only affects the margins, not the main signal?

rosinality commented 3 years ago

That is my guess. The 10-pixel margin alone would not be enough to maintain the canvas size, so I added zero padding to the upsampling and downsampling. It could definitely affect the pixels around the border, but I don't know exactly how the authors used padding or margins to keep the canvas size from shrinking.
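
To make the intuition concrete, here is a toy check in plain PyTorch (not the code from this repo): as long as the margin is wider than the filter support, every pixel influenced by the zero padding is cropped away together with the margin.

import torch
import torch.nn.functional as F

margin = 10
signal = torch.randn(1, 1, 64, 64)                    # the content we care about
canvas = F.pad(signal, [margin] * 4, mode="reflect")  # signal carried on a 10px margin

kernel = torch.ones(1, 1, 5, 5) / 25                  # toy 5x5 low-pass filter
filtered = F.conv2d(F.pad(canvas, [2] * 4), kernel)   # zero padding at the canvas border

# zero padding only touches output pixels within filter_size // 2 = 2 of the
# border, so cropping the 10px margin removes all of them
out = filtered[:, :, margin:-margin, margin:-margin]

# sanity check: with a different border treatment the interior is unchanged
reference = F.conv2d(F.pad(canvas, [2] * 4, mode="replicate"), kernel)
print(torch.allclose(out, reference[:, :, margin:-margin, margin:-margin]))  # True

The real filters and resampling factors are different, of course, but the principle is the same: the margin absorbs the border effects and is discarded at the end.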

jjparkcv commented 3 years ago

I see. I think your guess is quite reasonable. I do wonder, though, whether the unbalanced padding (i.e., different left and right padding) would introduce any bias. But other than that, the padding treatment looks quite elegant.