grigorisg9gr / polynomial_nets

Official Implementation of the CVPR'20 paper 'Π-nets: Deep Polynomial Neural Networks' and its T-PAMI-21 extension.
Other

Cannot reproduce results on image_generation_pytorch #3

Closed k4ntz closed 1 year ago

k4ntz commented 3 years ago

Hi, I have tried to reproduce the results of the paper for image generation (CIFAR10). I launched main.py with the default parameters, just varying the activation_fn param. I get an IS score of:

Do you have any recommendations on the parameters to use to reproduce the results, and/or any advice if I try to create a PolyNet to classify CIFAR10 images? I also don't get very good results on that task.

grigorisg9gr commented 3 years ago

Hi,

thanks for your interest in our work. I believe that the model you are using is simply the conversion of the DCGAN generator into a polynomial, right? In addition, the FID/IS scores reported in papers are typically computed with the TensorFlow Inception network; the one provided for PyTorch does not produce exactly the same values.

In other words, it depends on what exactly you are looking to replicate; however, the IS scores > 8 that we report in the paper were not obtained with DCGAN-polynomials, but rather with a custom polynomial architecture. I could share more details if you believe this would be helpful. Unfortunately, the corresponding experiments were done in Chainer, so we do not have the exact network in PyTorch.
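For context, the core idea of converting a generator into a polynomial of its input is the CCP recursion from the Π-nets paper: each layer multiplies a linear projection of the latent `z` element-wise with the running output, so the final output is a degree-N polynomial of `z`. A minimal sketch in plain Python (helper names `matvec`, `pi_net_ccp` are illustrative, not from the repo):

```python
def matvec(W, v):
    # Multiply a matrix (list of rows) with a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def pi_net_ccp(z, Ws, C, beta):
    """Degree-N polynomial of z via the CCP recursion:
    x_1 = W_1 z;  x_n = (W_n z) * x_{n-1} + x_{n-1};  out = C x_N + beta,
    where * is the element-wise (Hadamard) product."""
    x = matvec(Ws[0], z)
    for W in Ws[1:]:
        proj = matvec(W, z)
        x = [p * xi + xi for p, xi in zip(proj, x)]
    return [c + b for c, b in zip(matvec(C, x), beta)]

# With identity weights, two layers give out = z * z + z element-wise:
I = [[1, 0], [0, 1]]
out = pi_net_ccp([1.0, 2.0], [I, I], I, [0.0, 0.0])  # → [2.0, 6.0]
```

In a real generator the projections would be (transposed) convolutions rather than dense matrices, but the element-wise product with a skip connection is the same.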

Let us know if something is unclear.

ntlm1686 commented 1 year ago

@grigorisg9gr Hi, are the poly-nets used in the imagenet experiments free of any non-linear activations such as ReLU?

(edit) OK, I saw the residual blocks are "normalized" by tanh.

grigorisg9gr commented 1 year ago

Hi, which experiments are you referring to? Classification or generation? In both cases, in the imagenet experiments we modified standard architectures, e.g. StyleGAN in the generative case, so we maintained their activation functions.

ntlm1686 commented 1 year ago

Hi, @grigorisg9gr. It's the classification experiments on ImageNet. Are tanh functions used in the polyprod resnet? Specifically, in every residual block?

grigorisg9gr commented 1 year ago

If you are referring to the version in T-PAMI (i.e. https://arxiv.org/pdf/2006.13026.pdf), yes, we do have a tanh in each block. This tanh was added to stabilise the training.
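The stabilising effect is easy to see: without a squashing function, the element-wise product in each residual block composes into values that grow doubly exponentially with depth, while a tanh keeps each block's polynomial term bounded in (-1, 1). A toy sketch (one plausible placement of the tanh; the exact position in the repo's blocks may differ):

```python
import math

def poly_residual_block(x, w, stabilise=True):
    """One hypothetical second-degree residual block:
    out = x + term, where term = (w * x) * x element-wise.
    With stabilise=True the term is squashed by tanh, so each
    block can add at most 1 in magnitude per coordinate."""
    term = [wi * xi * xi for wi, xi in zip(w, x)]
    if stabilise:
        term = [math.tanh(t) for t in term]
    return [xi + t for xi, t in zip(x, term)]

x_tanh, x_raw = [2.0], [2.0]
for _ in range(5):
    x_tanh = poly_residual_block(x_tanh, [1.0], stabilise=True)
    x_raw = poly_residual_block(x_raw, [1.0], stabilise=False)
# x_tanh grows at most linearly with depth; x_raw explodes (2 -> 6 -> 42 -> ...)
```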

ntlm1686 commented 1 year ago

Thanks, @grigorisg9gr