TruthSearcher closed this 1 year ago
Thank you for the questions!
Changing the facial expression is indeed hard. I think this is because our training uses subject crops from the original image as conditioning, which likely encourages the model to reproduce the expression in the crop. Retrieval-based training may improve this, and test-time conditioning on multiple reference images should also help. We may also need specialized tools such as StyleGAN2 for certain fine-grained edits. We will add this discussion to the paper later.
I think stylization currently works quite well (see also Figure 6). It is essentially a tradeoff between identity preservation and prompt consistency, though. If we want to push for style consistency, we can lower alpha and get a more stylized image with some loss in identity preservation. Note that this is also style-dependent. For some styles, you can stylize well while preserving identity (e.g. the pointillism painting). For others (such as woodblock), paintings in that style do not contain much facial detail, so more identity preservation essentially makes the result less stylized.
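To make the alpha tradeoff concrete, here is a minimal sketch. It assumes alpha is the fraction of denoising steps that use the subject-augmented conditioning, with the remaining early steps using the text prompt only. That reading, the function name `conditioning_schedule`, and the `"text"` / `"subject"` labels are illustrative assumptions, not the actual code from the repo.

```python
def conditioning_schedule(num_steps: int, alpha: float) -> list[str]:
    """Return which conditioning each denoising step would use.

    Assumption (hypothetical): the first (1 - alpha) fraction of steps is
    conditioned on the text prompt only, and the last alpha fraction is
    conditioned on the subject-augmented prompt.
    """
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    # Number of steps that see the subject (identity) conditioning.
    num_subject_steps = round(alpha * num_steps)
    # Remaining early steps follow the text prompt (layout and style) only.
    num_text_steps = num_steps - num_subject_steps
    return ["text"] * num_text_steps + ["subject"] * num_subject_steps


if __name__ == "__main__":
    # Lower alpha means fewer identity-conditioned steps, so more stylization.
    for alpha in (0.8, 0.6, 0.4):
        schedule = conditioning_schedule(50, alpha)
        print(alpha, schedule.count("text"), schedule.count("subject"))
```

Under this assumption, lowering alpha gives the text prompt more steps to set the style before the subject conditioning kicks in, which is where the loss in identity preservation would come from.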
I checked the examples:
It seems to retain the same photorealistic style as the input image, as well as the same facial expression.