Open Bilal143260 opened 10 months ago
I think it should work, and combining it with ControlNet may work better.
Thanks for the reply, @xiaohu2015. Could you please provide any reference material for training IP-Adapter with ControlNet?
For reference, I tried training InstructPix2Pix, but that didn't work out. I also trained an OpenPose ControlNet, but it doesn't capture the details of dresses. Now I have to work out how to combine your training script with the ControlNet training script. Any collaboration, suggestions, or even a paid consultancy would be highly appreciated.
A good start is an OpenPose ControlNet (but with its condition replaced by an image condition) + IP-Adapter (with an image condition).
For the cloth image condition, I think you can use DINO to extract image features.
So what you mean is that I train Stable Diffusion with IP-Adapter (on the VITON dataset), and then at inference I use an OpenPose ControlNet with a cloth image condition added to it (maybe a clothing mask from DINO, etc.) plus IP-Adapter? Correct me if I am wrong.
I mean you train an IP-Adapter + ControlNet together.
Thank you for all the suggestions. I will give it a shot.
Do you mean training an IP-Adapter and a new OpenPose ControlNet at the same time? Or just training the IP-Adapter and keeping the ControlNet fixed (frozen)?
Training them together.
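To make "training together" concrete, here is a minimal sketch of what it could look like, assuming PyTorch. The three `nn.Linear` modules are toy stand-ins, not the real Stable Diffusion UNet, ControlNet, or IP-Adapter layers, and every name (`base_unet`, `controlnet`, `ip_adapter`, `training_step`) is hypothetical. The only point it illustrates is that the base model stays frozen while both conditioning branches sit in one optimizer and receive gradients from the same denoising loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins (hypothetical): real code would use the SD UNet, a ControlNet,
# and the IP-Adapter image projection + attention layers.
base_unet = nn.Linear(16, 16)   # pretrained denoiser, kept frozen
controlnet = nn.Linear(16, 16)  # pose branch, trainable
ip_adapter = nn.Linear(8, 16)   # garment-feature branch, trainable

base_unet.requires_grad_(False)

# One optimizer over both trainable branches, so they are updated jointly.
trainable = list(controlnet.parameters()) + list(ip_adapter.parameters())
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

def training_step(noisy_latent, pose_cond, cloth_feat, target_noise):
    # In this toy, both conditions are added as residuals before the frozen base.
    hidden = noisy_latent + controlnet(pose_cond) + ip_adapter(cloth_feat)
    pred = base_unet(hidden)
    loss = nn.functional.mse_loss(pred, target_noise)
    optimizer.zero_grad()
    loss.backward()   # gradients reach both branches, not the frozen base
    optimizer.step()
    return loss.item()

batch = 4
loss = training_step(
    torch.randn(batch, 16),  # noisy latent
    torch.randn(batch, 16),  # pose condition
    torch.randn(batch, 8),   # cloth image features
    torch.randn(batch, 16),  # noise target
)
```

The alternative asked about above (a fixed ControlNet) would correspond to also calling `controlnet.requires_grad_(False)` and leaving its parameters out of the optimizer.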
If so, should we change the image encoder of the IP-Adapter from CLIP to, for instance, DINOv2? @xiaohu2015
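As a rough illustration of what swapping the encoder would involve, here is a sketch assuming PyTorch. `ImageProj` is a hypothetical module, loosely modeled on a linear image projection that turns one pooled image embedding into a few extra context tokens; it is not the repository's actual code. The embedding widths are assumptions too (1024 for a CLIP ViT-H style image embedding, 768 for a DINOv2 ViT-B/14 CLS token), and random tensors stand in for real encoder outputs.

```python
import torch
import torch.nn as nn

class ImageProj(nn.Module):
    """Maps one pooled image embedding to a few extra context tokens."""

    def __init__(self, encoder_dim, cross_attention_dim=768, num_tokens=4):
        super().__init__()
        self.num_tokens = num_tokens
        self.cross_attention_dim = cross_attention_dim
        self.proj = nn.Linear(encoder_dim, num_tokens * cross_attention_dim)
        self.norm = nn.LayerNorm(cross_attention_dim)

    def forward(self, image_embeds):
        tokens = self.proj(image_embeds)
        tokens = tokens.reshape(-1, self.num_tokens, self.cross_attention_dim)
        return self.norm(tokens)

# Assumed embedding widths; random tensors replace real encoder outputs.
clip_embeds = torch.randn(2, 1024)  # e.g. CLIP image embedding
dino_embeds = torch.randn(2, 768)   # e.g. DINOv2 CLS token

clip_tokens = ImageProj(encoder_dim=1024)(clip_embeds)
dino_tokens = ImageProj(encoder_dim=768)(dino_embeds)

# Both projections produce tokens of the same shape for cross-attention.
print(clip_tokens.shape, dino_tokens.shape)
```

Under these assumptions only the input width of the projection changes with the encoder. The projection and the adapter's attention layers would still need to be retrained, since weights trained on CLIP features would not transfer to DINOv2 features.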
Is it a good idea to train IP-Adapter to act like VITON (virtual try-on)? The training data would include images of clothes and prompts as input, with an image of a model wearing that garment as ground truth.