Huage001 / AdaAttN

Officially unofficial PyTorch re-implementation of paper: AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer, ICCV 2021.
Apache License 2.0

Mean-variance-norm and Instance Norm #8

Closed sonnguyen129 closed 2 years ago

sonnguyen129 commented 2 years ago

Hi @Huage001 I read the paper and found that mean-variance norm (channel-wise mean-variance normalization) works much like instance norm. Can you explain why you use the mean-variance-norm function instead of instance norm? Thank you so much.

Huage001 commented 2 years ago

Hello, actually the difference between them is small. One difference is that mean-variance-norm uses the unbiased variance estimate while instance norm uses the biased one. You can also try instance norm, but I think there would be no substantial effect on the final outputs.
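For reference, a minimal sketch of the two (the `mean_variance_norm` helper below is written from the description above, not copied from the repo):

```python
import torch
import torch.nn.functional as F

def mean_variance_norm(feat, eps=1e-5):
    # Channel-wise normalization over the spatial dims. torch.var defaults
    # to the unbiased estimator (Bessel's correction), which is the difference
    # mentioned above.
    n, c = feat.size(0), feat.size(1)
    flat = feat.view(n, c, -1)
    mean = flat.mean(dim=2).view(n, c, 1, 1)
    std = (flat.var(dim=2) + eps).sqrt().view(n, c, 1, 1)
    return (feat - mean) / std

x = torch.randn(2, 64, 32, 32)
out_mvn = mean_variance_norm(x)
out_in = F.instance_norm(x)             # uses the biased variance estimate
print((out_mvn - out_in).abs().max())   # tiny gap, shrinking as H*W grows
```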

sonnguyen129 commented 2 years ago

I have one more question. When testing, is the output size 256x256? Can the models produce other sizes?

Huage001 commented 2 years ago

In our experiments, the default output size is 512x512. Other sizes are also fine, but it is better to set the size to a multiple of 16, to avoid problems in the down-sampling and up-sampling operations.
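One way to enforce that constraint at test time is a padding helper like the hypothetical one below (not part of this repo):

```python
import torch
import torch.nn.functional as F

def pad_to_multiple(img, multiple=16):
    # Hypothetical helper: reflection-pad H and W up to the next multiple,
    # so encoder down-sampling and decoder up-sampling stay aligned.
    _, _, h, w = img.shape
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(img, (0, pad_w, 0, pad_h), mode='reflect')

img = torch.rand(1, 3, 500, 333)
print(pad_to_multiple(img).shape)  # torch.Size([1, 3, 512, 336])
```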

sonnguyen129 commented 2 years ago

Hi @Huage001 Thank you for your reply. When I read the MUNIT paper, the authors said that IN removes important style information.

[screenshot: excerpt from the MUNIT paper on instance normalization]

But in the AdaAttN paper, the authors apply Norm to the style features inside the AdaAttN module.

[screenshot: AdaAttN module diagram from the paper]

This confuses me. Could you please explain?

Huage001 commented 2 years ago

Since IN removes the style information, we can compute content similarity between the content and style features after IN. This similarity is used to aggregate the style feature F_s, as shown in the third row of the figure above. The aggregated style feature itself is not passed through IN.
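For concreteness, here is a simplified sketch of that computation following the paper's Q/K/V description (the learned 1x1 projections and multi-layer details of the actual module are omitted):

```python
import torch

def mvn(feat, eps=1e-5):
    # Channel-wise mean-variance normalization (see the earlier sketch).
    n, c = feat.shape[:2]
    flat = feat.view(n, c, -1)
    mean = flat.mean(2).view(n, c, 1, 1)
    std = (flat.var(2) + eps).sqrt().view(n, c, 1, 1)
    return (feat - mean) / std

def adaattn_sketch(Fc, Fs, eps=1e-5):
    # Fc, Fs: (N, C, H, W) content / style features from the same VGG layer.
    n, c, h, w = Fc.shape
    q = mvn(Fc).view(n, c, -1).permute(0, 2, 1)   # queries: normalized content
    k = mvn(Fs).view(n, c, -1)                    # keys: normalized style
    v = Fs.view(n, c, -1).permute(0, 2, 1)        # values: UN-normalized style
    attn = torch.softmax(torch.bmm(q, k), dim=-1)            # (N, HcWc, HsWs)
    mean = torch.bmm(attn, v)                                # per-position style mean
    var = torch.relu(torch.bmm(attn, v * v) - mean * mean)   # E[x^2] - E[x]^2, clamped
    std = (var + eps).sqrt()
    # Scale and shift the normalized content with the aggregated style stats.
    out = std * mvn(Fc).view(n, c, -1).permute(0, 2, 1) + mean
    return out.permute(0, 2, 1).reshape(n, c, h, w)
```

Note that only Q and K are normalized; the values V, and therefore the aggregated statistics, keep the raw style information.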

sonnguyen129 commented 2 years ago

Hi @Huage001 Thank you for your reply. What does "adaptive" in adaptive attention normalization mean? Can any model whose formula looks like AdaAttN's be called adaptive? Is SANet adaptive? I didn't see the authors mention it at all.

Huage001 commented 2 years ago

The name AdaAttN actually follows AdaIN. "Adaptive" describes the normalization operation, whose parameters depend dynamically (adaptively) on the style feature. From this perspective, we can also call SANet, and indeed all current attention-based methods, "adaptive".
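For comparison, a minimal AdaIN sketch (after Huang & Belongie, ICCV 2017) shows what "adaptive" means there: the normalization's scale and shift come from the style feature instead of being learned constants:

```python
import torch

def adain(Fc, Fs, eps=1e-5):
    # "Adaptive": the scale and shift are not learned affine parameters (as in
    # plain IN) but the per-channel std/mean of the style feature, so they
    # change with every style input.
    n, c = Fc.shape[:2]
    cf, sf = Fc.view(n, c, -1), Fs.view(n, c, -1)
    c_mean, c_std = cf.mean(2, keepdim=True), (cf.var(2, keepdim=True) + eps).sqrt()
    s_mean, s_std = sf.mean(2, keepdim=True), (sf.var(2, keepdim=True) + eps).sqrt()
    return (s_std * (cf - c_mean) / c_std + s_mean).view_as(Fc)
```

AdaAttN generalizes this from one mean/std pair per channel to a per-position pair computed by attention.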

sonnguyen129 commented 2 years ago

Hi @Huage001 Do you think swapping the content and style features in the SANet module or the AdaAttN module makes any difference? Thank you so much

Huage001 commented 2 years ago

In that case, you can imagine the content image would serve as the "style reference" while the style image would serve as the "content reference". Typically, in attention-based style transfer, the query (Q) should come from the content, and the key (K) and value (V) from the style.
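In terms of the earlier `adaattn_sketch`, the swap is just exchanging the arguments:

```python
# With the inputs swapped, Q is built from the style feature and K, V from
# the content feature, so each style position aggregates content statistics.
out_swapped = adaattn_sketch(Fs, Fc)  # vs. the intended adaattn_sketch(Fc, Fs)
```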

sonnguyen129 commented 2 years ago

Hi @Huage001 Thank you for your explanation. I will close this issue and re-open it if I have another question in the future. Wish you all health, success and happiness! Best regards, Son Nguyen.