nldhuyen0047 opened 2 months ago
Hi @nldhuyen0047, thank you for your interest in our work. I noticed that you previously mentioned an issue related to model training. Are the inference results based on your own trained model? Also, could you post the content image, style image, and the generated results?
I met the same problem. For example, when the content image and the style image are the same, the output is supposed to be the same image as well (some works introduce an identity loss
to guarantee this). But with StyleShot, the output differs greatly from the input image.
Hi @zhihongz, thank you for your interest in our work. I randomly tested some cases with identical content and style images, and the outputs are the same image as well; here are the samples:
I think the difference you see comes from the resolution of the inputs. The style image is center-cropped to 512×512, while the content image is not processed, which might lead to the difference.
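For reference, the preprocessing asymmetry described above can be sketched roughly as follows. This is only an illustration, not StyleShot's actual preprocessing code; the exact crop/resize order in the repo may differ:

```python
from PIL import Image

def center_crop(img: Image.Image, size: int = 512) -> Image.Image:
    """Center-crop to a square on the shorter side, then resize to size x size.

    Hypothetical helper mirroring the 512x512 style-image crop mentioned
    above; the content image would skip this step entirely, so the two
    inputs can end up with different framing even when they start identical.
    """
    w, h = img.size
    s = min(w, h)
    left, top = (w - s) // 2, (h - s) // 2
    square = img.crop((left, top, left + s, top + s))
    return square.resize((size, size))
```

With a non-square input (say 800×600), the crop discards the left and right margins before resizing, which is one plausible source of the mismatch when content and style images are the same file.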
Hmm, the output looks similar to the input, but there are obvious differences. For example, the color of the cat in your last screenshot changes to blue.
When you test with some natural images, the differences become even larger. You can examine this picture:
In StyleShot, styles are integrated into style embeddings by a specially designed style-aware encoder and then incorporated into the diffusion model through a cross-attention module. This highly aggregated style information is not suitable for pixel-level reconstruction.
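The mechanism described above can be sketched in a few lines. This is a hedged illustration of generic cross-attention style injection, not StyleShot's actual module (class and parameter names here are made up): the style embeddings only supply keys and values, so the U-Net features receive aggregated style statistics rather than any pixel-aligned signal, which is why exact reconstruction is not expected.

```python
import torch
import torch.nn as nn

class StyleCrossAttention(nn.Module):
    """Toy sketch of cross-attention that injects style embeddings.

    hidden:    (B, N, dim)       -- spatial features from the diffusion U-Net
    style_emb: (B, M, style_dim) -- aggregated style tokens from an encoder
    """

    def __init__(self, dim: int, style_dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(style_dim, dim, bias=False)
        self.to_v = nn.Linear(style_dim, dim, bias=False)

    def forward(self, hidden: torch.Tensor, style_emb: torch.Tensor) -> torch.Tensor:
        q = self.to_q(hidden)
        k = self.to_k(style_emb)
        v = self.to_v(style_emb)
        # Scaled dot-product attention: queries from image features,
        # keys/values from the (small, aggregated) set of style tokens.
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        # Residual injection: style modulates features, it does not replace them.
        return hidden + attn @ v
```

Because M (the number of style tokens) is tiny compared to the number of pixels, the attention output is a heavily compressed summary of the style image, and there is no path through which per-pixel content of the style image could be reconstructed.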
Got it, thanks for your patient and quick reply.
Sorry for my late reply.
I have encountered some problems when training the model.
I got it. Thank you very much.
Hi, thanks for your excellent work.
I have a question about style transfer. I would like to transfer the style of a target image onto a content image. At inference time, I tested both Contour and LineArt, but the results are not good. In my dataset, the content image is a cube with no details on it (I would like to add details to it), and the target image is a house with details and colors. With LineArt, the colors and some details that should be added to the content image are not filled in, and with Contour, the image changes too much and becomes quite messy. Could you please suggest some ways to improve the result?
Thank you so much.