Open SlZeroth opened 1 year ago
If you want to generate something like "an image of A and B shaking hands" after fine-tuning the model on photos of A and B, then I think pipeline_stable_diffusion_promptnet.py needs to be revised. The current implementation only generates images conditioned on a single prompt like "a photo of $S^*$", or on a list of prompts ["a photo of $S_1^*$", "a photo of $S_2^*$", ...], which is different from one prompt containing both subjects, such as "a photo of $S_1^* \text{ and } S_2^*$".
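To make the distinction concrete, here is a toy sketch. It is not code from pipeline_stable_diffusion_promptnet.py; `toy_word_embedding`, `embed_prompt`, the placeholder strings `S1*` and `S2*`, and the 4-dimensional vectors are all hypothetical stand-ins for a real tokenizer, text encoder, and learned subject embeddings. It only shows the difference between a list of single-subject prompts and one prompt that carries both subject embeddings.

```python
from typing import Dict, List

EMBED_DIM = 4  # toy size; a real text encoder uses a much larger dimension


def toy_word_embedding(word: str) -> List[float]:
    # Deterministic stand-in for a text encoder's token embedding lookup.
    return [float((sum(map(ord, word)) + i) % 7) for i in range(EMBED_DIM)]


def embed_prompt(prompt: str, subject_embeddings: Dict[str, List[float]]) -> List[List[float]]:
    # Embed a prompt word by word; any word that is a known placeholder
    # is replaced by that subject's (hypothetical) learned embedding.
    rows = []
    for word in prompt.split():
        if word in subject_embeddings:
            rows.append(list(subject_embeddings[word]))
        else:
            rows.append(toy_word_embedding(word))
    return rows


# Hypothetical per-subject embeddings (assumed to come from fine-tuning).
subjects = {
    "S1*": [1.0, 0.0, 0.0, 0.0],
    "S2*": [0.0, 1.0, 0.0, 0.0],
}

# List of prompts: each conditioning sequence contains only one subject,
# so each generated image shows one subject.
per_subject = [embed_prompt(p, subjects) for p in ["a photo of S1*", "a photo of S2*"]]

# Single prompt with both placeholders: one conditioning sequence
# carries both subject embeddings at once.
joint = embed_prompt("a photo of S1* and S2* shaking hands", subjects)
```

In this sketch, `per_subject` holds two separate sequences with one subject each, while `joint` is a single sequence containing both subject vectors. Supporting the second case is the part that would require changes to the pipeline.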
Thank you for the answer!
Is it possible to add multiple tokens while keeping the promptnet approach intact?
I'm trying to make multi-token generation possible by modifying the source code as you described in your reply, but before I go further, I wanted to ask whether this would be easy for you to do.
I may update the code later, but I'm not sure how the performance will be.
@drboog Thank you so much!
I wrote an implementation and tested it on my local machine. Unfortunately, the performance is not satisfying. For example, when we ask it to generate a photo of A and B shaking hands, it does generate an image of two people shaking hands. However, each of the two people looks like a combination of A and B, whereas we expect one person to look like A and the other to look like B. This is an interesting topic, and I will think about improvements (to the method, or tricks) in the future.
Thank you for trying. I have read the C-LoRA paper, and it seems to address issues that arise when learning multiple subjects of the same type. https://arxiv.org/abs/2304.06027
I want to generate two people simultaneously. Is it possible to use multiple subjects?