linfengWen98 / CAP-VSTNet

[CVPR 2023] CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
MIT License

Can you explain a little bit more about the matting loss? #15

Open yuhaoliu7456 opened 7 months ago

yuhaoliu7456 commented 7 months ago

In "Matting Laplacian [22] may result in blurry images when it is used with another network like one with an encoder-decoder architecture. But it does not have this issue in CAP-VSTNet, since the bijective transformation of reversible network theoretically requires all information to be preserved."

I'm curious about the relationship between the bijective transformation of a reversible network, which theoretically must preserve all information, and the image blur caused by the matting Laplacian. Even after reading this paragraph several times, I still cannot fully understand the point here.

So can you please kindly help me figure this out?

Thanks.

linfengWen98 commented 6 months ago

1. Network: Given inputs $X$ and $X_{smooth}$, we have $Net.Forward(X) \rightarrow Y$ and $Net.Forward(X_{smooth}) \rightarrow Y_{smooth}$. For a reversible network, $Y \neq Y_{smooth}$ whenever $X \neq X_{smooth}$. For an encoder, it is possible that $Enc(X) = Enc(X_{smooth}) = Y_{smooth}$.

Similarly, $Net.Backward(Y) \rightarrow X$ and $Net.Backward(Y_{smooth}) \rightarrow X_{smooth}$. For a reversible network, $X \neq X_{smooth}$ whenever $Y \neq Y_{smooth}$. For a decoder, it is possible that $Dec(Y) = Dec(Y_{smooth}) = X_{smooth}$.
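To make the contrast concrete, here is a minimal toy sketch in PyTorch (illustrative only, not the CAP-VSTNet code): an invertible affine map necessarily keeps $X$ and $X_{smooth}$ distinct, while a lossy averaging "encoder" can map both to the same feature.

```python
import torch
import torch.nn.functional as F

# Toy illustration: a sharp input X and a smoothed input X_smooth that
# differ only in high-frequency detail.
x = torch.randn(1, 1, 8, 8)
x_smooth = F.avg_pool2d(x, 2)                       # blur by 2x2 averaging
x_smooth = F.interpolate(x_smooth, scale_factor=2)  # back to the original size

# A bijective "forward" pass (here just an invertible affine map):
# distinct inputs necessarily give distinct outputs, so Y != Y_smooth.
scale, shift = 2.0, 0.5
y = scale * x + shift
y_smooth = scale * x_smooth + shift
print(torch.allclose(y, y_smooth))      # False: the difference is preserved

# A lossy "encoder" (average pooling): it can map both inputs to the same
# feature, Enc(X) == Enc(X_smooth), because the detail it discards is exactly
# what distinguished them.
enc = F.avg_pool2d(x, 2)
enc_smooth = F.avg_pool2d(x_smooth, 2)
print(torch.allclose(enc, enc_smooth))  # True: the blur is unrecoverable
```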

2. Linear Transform: Assuming the features $Y$ and $Y_{smooth}$ have the same content factor $C$, we have $L*C \rightarrow Y$ and $L_{smooth}*C \rightarrow Y_{smooth}$.

The stylized result depends on $L$ and the covariance matrix $\Sigma$ ($\Sigma = L*L^T$).
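As a rough illustration of that relation, here is a small PyTorch sketch (the toy covariances are chosen arbitrarily, not taken from the paper): a shared content factor $C$ is mapped by two different linear operators $L$ and $L_{smooth}$, and each operator is tied to its covariance via $\Sigma = L L^T$.

```python
import torch

# Toy sketch of the linear-transform view (notation follows the comment
# above, not the repository code): one content factor C, two operators.
d, n = 4, 100
C = torch.randn(d, n)                          # content factor: (channels, pixels)

sigma = torch.eye(d) + 0.1 * torch.ones(d, d)  # toy target covariance for Y
sigma_smooth = 0.5 * torch.eye(d)              # toy target covariance for Y_smooth

L = torch.linalg.cholesky(sigma)               # Sigma = L @ L.T
L_smooth = torch.linalg.cholesky(sigma_smooth)

Y = L @ C                                      # Y        = L * C
Y_smooth = L_smooth @ C                        # Y_smooth = L_smooth * C

# The stylized result is determined by L, or equivalently by Sigma = L L^T:
print(torch.allclose(L @ L.T, sigma))                        # True
print(torch.allclose(L_smooth @ L_smooth.T, sigma_smooth))   # True
```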