Open ZzzTD opened 1 year ago
Style: the style is captured by the Gram matrix of the network's feature maps, which records texture information (which features tend to occur together) but not the global arrangement. This means the overall layout of the style image doesn't matter, only its local features. Starting from a white noise image, the generated image is optimized to minimize the mean-squared error between the Gram matrices of the style image and the generated image.
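To make the Gram matrix point concrete, here is a minimal NumPy sketch. It assumes a feature map of shape (channels, height, width); the random arrays are hypothetical stand-ins, not real network activations, and the plain mean-squared error omits any normalization constants a particular implementation might use.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map with shape (channels, height, width)."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel, positions flattened
    return flat @ flat.T               # (c, c) channel-to-channel correlations

def style_loss(gen_features, style_features):
    """Mean-squared error between the two Gram matrices."""
    diff = gram_matrix(gen_features) - gram_matrix(style_features)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
style = rng.normal(size=(4, 8, 8))  # hypothetical feature map (assumption)
noise = rng.normal(size=(4, 8, 8))  # stands in for the white noise start

# Shuffle the spatial positions (same permutation for every channel).
perm = rng.permutation(8 * 8)
shuffled = style.reshape(4, -1)[:, perm].reshape(4, 8, 8)

# The Gram matrix ignores where features occur, so shuffling does not change it.
print(np.allclose(gram_matrix(style), gram_matrix(shuffled)))
# A different feature map gives a positive style loss to minimize.
print(style_loss(noise, style) > 0)
```

The shuffle check is the "texture but not global arrangement" property: the spatial positions are summed out, so only feature co-occurrence survives.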
Content: the neural network compares the input image and the generated image and minimizes the squared error between them, so that the generated image looks as similar to the original as possible. In the original formulation by Gatys et al., this comparison is made on the feature activations of a deeper layer rather than on raw pixels.
The total loss function combines the content and the style loss, so that both the content and style can be seen in the generated image. $L_{\text{total}} = \alpha L_{\text{content}} + \beta L_{\text{style}}$
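A small worked example of the weighted sum, as a sketch only: the arrays, the style loss value, and the weights `alpha` and `beta` below are made-up numbers chosen for easy arithmetic, not recommended settings.

```python
import numpy as np

def content_loss(gen, orig):
    """Squared error between the generated and the original representation."""
    return float(np.sum((gen - orig) ** 2))

def total_loss(l_content, l_style, alpha, beta):
    """Weighted sum: alpha * L_content + beta * L_style."""
    return alpha * l_content + beta * l_style

# Hypothetical values (assumptions for illustration only).
gen = np.array([1.0, 2.0, 3.0])
orig = np.array([1.0, 2.5, 2.0])

l_content = content_loss(gen, orig)  # 0.0 + 0.25 + 1.0 = 1.25
l_style = 0.5                        # placeholder style loss value

print(total_loss(l_content, l_style, alpha=1.0, beta=4.0))  # 1.25 + 2.0 = 3.25
```

Raising `beta` relative to `alpha` pushes the result toward the style image's texture, while raising `alpha` keeps more of the content.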
Here is a more detailed explanation: https://towardsdatascience.com/how-do-neural-style-transfers-work-b76de101eb3
@messierandromeda Thank you very much
May I know how to obtain the style reconstruction and the content reconstruction? Thanks!