Open jackwolfey opened 1 month ago
Hi @jackwolfey, thanks for your interest. You can concatenate the features of multiple reference characters into a single feature sequence and then inject that sequence into our denoiser (UNet) through cross-attention. The only change needed is to the style hidden state dimension.
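To make the idea concrete, here is a minimal sketch of "concatenate, then cross-attend" in plain Python. It is not the repository's code. The function names, the toy feature values, and the choice to use the style tokens directly as keys and values (with no learned projections) are all assumptions made for illustration. In a PyTorch model the concatenation step would typically be a `torch.cat` along the sequence dimension, and the attention layer would have learned query/key/value projections. Which dimension has to change in the actual denoiser is not shown here.

```python
import math

def concat_style_features(per_reference_features):
    """Join per-reference token sequences along the sequence axis.

    Each reference is a list of tokens; each token is a list of floats.
    (Hypothetical helper, not part of the repository.)
    """
    sequence = []
    for tokens in per_reference_features:
        sequence.extend(tokens)
    return sequence

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    """Scaled dot-product attention with keys = values = context.

    Assumption: no learned projections, unlike a real attention layer.
    """
    dim = len(context[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, context)) for j in range(dim)]
        )
    return outputs

# Three hypothetical reference characters, each encoded as 2 tokens of width 4.
ref_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
ref_b = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
ref_c = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

# The style sequence grows from 2 tokens to 6; the token width stays 4.
style_sequence = concat_style_features([ref_a, ref_b, ref_c])

# Two hypothetical denoiser query tokens attend over all six style tokens.
queries = [[0.2, 0.1, 0.0, 0.3], [0.0, 0.4, 0.1, 0.0]]
attended = cross_attention(queries, style_sequence)
```

Because attention sums over however many context tokens it receives, the output keeps one vector per query regardless of how many references are concatenated. In this toy version only the style sequence length changes.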
@yeungchenwa Thanks for your reply. I tried to locate the code that needs to be modified, as you mentioned, but I don't know how to modify it. I'm a beginner in AI programming, so could you please provide more detailed instructions on how to make the change? Thank you very much.
Does the model's inference support few-shot generation, or is there a way for the model to take features from multiple reference characters and then run inference? Nice work, by the way.