diamond0910 closed this issue 2 years ago
Thanks for your interest in our work. Note that we do not replace noise injection with SPE. To measure the performance of the generator backbone in isolation, we remove noise injection in all experiments. Zero padding in convolutions gives the model absolute pixel positions during generation, which transformers lack. We therefore add SPE to provide absolute global positions to the transformer generator.
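To make the two claims above concrete, here is a minimal NumPy sketch, not the StyleSwin implementation. It assumes SPE means a sinusoidal position encoding. The helper names `conv2d_zero_pad` and `sinusoidal_position_encoding_2d`, the base of 10000, and the channel layout are illustrative choices. The first part shows that a zero-padded convolution responds differently at the border of a perfectly flat image, which is how position leaks in. The second part builds an explicit per-pixel position code that a transformer could add to its features.

```python
import numpy as np

def conv2d_zero_pad(image, kernel):
    """Naive 'same' 2D cross-correlation with zero padding (illustrative helper)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def sinusoidal_position_encoding_2d(height, width, channels):
    """Return a (channels, height, width) position code.

    Assumed layout: half of the channels encode the row index and half the
    column index, each as sin/cos pairs over geometrically spaced frequencies.
    """
    if channels % 4 != 0:
        raise ValueError("channels must be divisible by 4")
    quarter = channels // 4
    freqs = 1.0 / (10000.0 ** (np.arange(quarter) / quarter))
    y = np.arange(height)[:, None] * freqs[None, :]  # (height, quarter)
    x = np.arange(width)[:, None] * freqs[None, :]   # (width, quarter)
    enc = np.zeros((channels, height, width))
    enc[0:quarter] = np.sin(y).T[:, :, None]              # varies along rows
    enc[quarter:2 * quarter] = np.cos(y).T[:, :, None]
    enc[2 * quarter:3 * quarter] = np.sin(x).T[:, None, :]  # varies along columns
    enc[3 * quarter:] = np.cos(x).T[:, None, :]
    return enc

# 1) Zero padding leaks position: a flat input gives a non-flat output.
flat = np.ones((5, 5))
response = conv2d_zero_pad(flat, np.ones((3, 3)))
print(response[0, 0], response[0, 2], response[2, 2])  # corner 4.0, edge 6.0, interior 9.0

# 2) An explicit position code gives every pixel a distinct vector.
spe = sinusoidal_position_encoding_2d(4, 4, 8)
codes = spe.reshape(8, -1).T  # one row per pixel
num_distinct = len(np.unique(codes, axis=0))
print(spe.shape, num_distinct)
```

Attention layers have no padding, so they have no comparable border cue. An explicit code like the one in part 2 supplies that information directly.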
Thanks for your reply. Adding noise seems to improve performance in StyleGAN1. How much performance gain did you see when adding noise to StyleSwin?
In our experiments, simply adding noise did not result in a significant performance improvement. We hypothesize that the noise input may take effect only in combination with specific architectures or components (such as anti-aliased upsampling, which we do not use in StyleSwin). Further improvement of the transformer generator is under exploration.
Compared with StyleGAN2, I notice that you replace noise with SPE at the same place. What are the differences between SPE and noise? Can SPE achieve the effect of noise? It seems that SPE is a fixed vector?
Thanks.
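The "fixed vector" question can be illustrated with a small sketch. It is a toy comparison under stated assumptions, not code from StyleSwin or StyleGAN2: `inject_noise`, its `strength` parameter, and `add_position_code` are hypothetical names, and the position code is a simplified stand-in for a real sinusoidal encoding. It shows that injected noise is resampled on every call, while a position code is a deterministic function of the pixel location.

```python
import numpy as np

def inject_noise(features, rng, strength=0.1):
    """Add fresh Gaussian noise, one map shared across channels.

    `strength` stands in for the learned scale used in StyleGAN-style
    noise injection; the value here is arbitrary.
    """
    noise = rng.standard_normal((1,) + features.shape[1:])
    return features + strength * noise

def add_position_code(features):
    """Add a toy code that depends only on the (row, column) position."""
    _, h, w = features.shape
    code = np.sin(np.arange(h))[:, None] + np.cos(np.arange(w))[None, :]
    return features + code[None]

rng = np.random.default_rng(0)
feats = np.zeros((2, 4, 4))  # (channels, height, width)

# Two calls with the same input: the noise differs, the position code does not.
noise_differs = not np.allclose(inject_noise(feats, rng), inject_noise(feats, rng))
spe_same = np.array_equal(add_position_code(feats), add_position_code(feats))
print(noise_differs, spe_same)
```

In this sense the two serve different roles. Noise adds per-sample stochastic variation. A position code adds the same spatial reference to every sample.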