linfengWen98 / CAP-VSTNet

[CVPR 2023] CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer
MIT License

Question about feature combination #3

Closed TheFatBlue closed 1 year ago

TheFatBlue commented 1 year ago

Hi, the performance of your work is really impressive. However, when I tried the linear interpolation of extracted features described in the paper to combine the features of two styles, the result was poor. Could you share a specific implementation of the interpolation? BTW, the code I used is as follows:

# cWCT.py
def coloring(self, whiten_xc, xs, xs_):
    xs_mean = torch.mean(xs, -1)
    xs = xs - xs_mean.unsqueeze(-1).expand_as(xs)
    xs_mean_ = torch.mean(xs_, -1)
    xs_ = xs_ - xs_mean_.unsqueeze(-1).expand_as(xs_)

    conv = (xs @ xs.transpose(-1, -2)).div(xs.shape[-1] - 1)
    conv_ = (xs_ @ xs_.transpose(-1, -2)).div(xs_.shape[-1] - 1)

    Ls = self.cholesky_dec(conv, invert=False)
    Ls_ = self.cholesky_dec(conv_, invert=False)

    alpha = 0.75
    weighted_Ls = torch.add(torch.mul(Ls, 1-alpha), torch.mul(Ls_, alpha))

    coloring_cs = weighted_Ls @ whiten_xc
    coloring_cs = coloring_cs + xs_mean.unsqueeze(-1).expand_as(coloring_cs)

    return coloring_cs

linfengWen98 commented 1 year ago

Thank you for your interest. The code has been updated. For simplicity, we assume the feature map is centered (Section B). In practice, we need to interpolate the 'mean' as well.
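
For readers following along, the point about the mean can be sketched as below. This is a minimal sketch and not the repository's updated code: the function names `mean_and_cholesky` and `coloring_interp`, the small `eps` ridge, and the standalone (non-method) form are assumptions made for illustration. It calls `torch.linalg.cholesky` directly instead of the `self.cholesky_dec` helper used in the question, and it assumes features shaped `(B, C, N)` as in the snippet above.

```python
import torch


def mean_and_cholesky(x, eps=1e-5):
    """Return (mean, L) for features x of shape (B, C, N), with L @ L^T ~= cov(x)."""
    mean = x.mean(dim=-1, keepdim=True)            # (B, C, 1)
    xc = x - mean                                  # center along the spatial axis
    cov = (xc @ xc.transpose(-1, -2)) / (x.shape[-1] - 1)
    # Assumption: a small ridge keeps the covariance positive definite.
    eye = torch.eye(cov.shape[-1], dtype=x.dtype, device=x.device)
    L = torch.linalg.cholesky(cov + eps * eye)     # lower-triangular factor
    return mean, L


def coloring_interp(whiten_xc, xs, xs_, alpha=0.75):
    """Color whitened content features with a blend of two styles.

    Both the Cholesky factors and the means are interpolated with the
    same weight; alpha is the weight of the second style xs_.
    """
    mean_a, L_a = mean_and_cholesky(xs)
    mean_b, L_b = mean_and_cholesky(xs_)
    L = (1 - alpha) * L_a + alpha * L_b
    mean = (1 - alpha) * mean_a + alpha * mean_b   # the step missing in the question
    return L @ whiten_xc + mean                    # mean broadcasts over N
```

With `alpha = 0` this reduces to coloring with `xs` alone, and with `alpha = 1` to coloring with `xs_` alone. The snippet in the question added back only `xs_mean`, so at `alpha = 0.75` the second-order statistics came mostly from the second style while the mean came entirely from the first.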