InstantStyle / InstantStyle

InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
https://instantstyle.github.io/

Is this compatible with IP-Adapter FaceID? Can you explain how to use this code with a pretrained IP-Adapter FaceID? #27

Open unboxdisease opened 2 months ago

unboxdisease commented 2 months ago

I have a pretrained IP-Adapter that uses AdaFace embeddings, and I am trying to stylize its outputs. I think this method could help, but I don't think it is compatible with FaceID. Is it?

tanghengjian commented 2 months ago

FYI https://github.com/tanghengjian/instantid_with_ipa/commit/0a3b7334e4cd8b1bfa410a805f556b7b8c17ed84

haofanwang commented 2 months ago

Our team is working on integrating InstantStyle into the native diffusers API (check this PR). Once it is merged, we will officially support multiple IPAs, InstantID+InstantStyle, and InstantStyle+IP-Adapter.

plienhar commented 1 month ago

@haofanwang Do you have a code example for InstantID+InstantStyle? Loading the InstantID IP-Adapter is pretty custom, and it is hard to figure out how it interacts with a second adapter loaded separately when you don't know the inner workings of diffusers. Many thanks in advance!

elismasilva commented 1 month ago

Even multiple IPAs in diffusers are not working properly, maybe because of a shape problem in the models. I am trying to build my own pipeline to load InstantID and InstantStyle together, but I am stuck on an image embedding problem: InstantID uses controlidentity with 3D tensors, while diffusers uses 4D tensors. The other problem is the attention processors.
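The rank mismatch described above can be illustrated with a toy sketch. It assumes the 3D layout is (batch, tokens, dim) and the 4D layout is (batch, num_images, tokens, dim); both layouts, the sizes, and the helper names `shape_of` and `add_image_axis` are assumptions for illustration, not something confirmed in this thread. Plain nested lists stand in for tensors.

```python
from typing import Any, List, Tuple


def shape_of(nested: Any) -> Tuple[int, ...]:
    """Return the shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


def add_image_axis(batch: List[list]) -> List[list]:
    """Wrap each sample in a length-1 list: (batch, tokens, dim) -> (batch, 1, tokens, dim)."""
    return [[sample] for sample in batch]


# Toy 3D embedding: batch=2, tokens=3, dim=4 (sizes made up for illustration).
emb_3d = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]
emb_4d = add_image_axis(emb_3d)

print(shape_of(emb_3d))  # (2, 3, 4)
print(shape_of(emb_4d))  # (2, 1, 3, 4)
```

With real tensors the same step would be an unsqueeze on axis 1. Whether that alone is enough for a combined InstantID and InstantStyle pipeline is not established here; the attention processors would still need separate handling.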

xiankgx commented 1 month ago

Is this an issue related to IP-Adapter vs IP-Adapter Plus?

IP-Adapter Plus does not work with image embeddings, but with image "fine-grained features". For example, the image embedding shapes for IPAdapterXL vs. IPAdapterPlusXL are (1, 1280) vs. (1, 257, 1280), respectively. This suggests the Plus SDXL models learned from more fine-grained features (features from image patches).
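As a rough check on where 257 could come from, here is a small sketch. It assumes a ViT image encoder with 14-pixel patches on a 224x224 input plus one class token; the thread does not state which encoder is used, so treat these numbers and the helper names as assumptions.

```python
from typing import Tuple


def vit_token_count(image_size: int = 224, patch_size: int = 14,
                    class_token: bool = True) -> int:
    """Number of tokens a ViT emits: one per patch, plus an optional class token."""
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + (1 if class_token else 0)


def describe_embedding(shape: Tuple[int, ...]) -> str:
    """Classify an embedding by rank: (batch, dim) vs. (batch, tokens, dim)."""
    if len(shape) == 2:
        return "pooled embedding: one vector per image"
    if len(shape) == 3:
        return f"token features: {shape[1]} vectors per image"
    raise ValueError(f"unexpected shape {shape}")


print(vit_token_count())                   # 257 = 16 * 16 patches + 1 class token
print(describe_embedding((1, 1280)))       # pooled embedding: one vector per image
print(describe_embedding((1, 257, 1280)))  # token features: 257 vectors per image
```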

This means embedding arithmetic like "king" - "man" + "woman", as demonstrated in this repo, will likely not work with the Plus models.
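To make the point concrete, here is a minimal sketch of that kind of arithmetic on a single pooled vector. The 4-dimensional toy vectors stand in for real 1280-dimensional embeddings, and the function name `subtract_content`, its `weight` parameter, and all values are made up for illustration. It only shows that the subtraction is well defined for one vector per image and has no obvious counterpart for a sequence of patch features.

```python
from typing import List, Sequence


def subtract_content(image_embed: Sequence[float],
                     content_embed: Sequence[float],
                     weight: float = 1.0) -> List[float]:
    """Toy 'image embedding minus content embedding' on one pooled vector."""
    if any(isinstance(v, (list, tuple)) for v in image_embed):
        raise ValueError("expected one pooled vector, got a sequence of token features")
    if len(image_embed) != len(content_embed):
        raise ValueError("embeddings must have the same dimension")
    return [i - weight * c for i, c in zip(image_embed, content_embed)]


# Made-up values; real embeddings would come from an image and a text encoder.
image_embed = [1.0, 0.25, 0.5, 0.125]  # style and content mixed together
content_embed = [0.5, 0.0, 0.5, 0.0]   # embedding of the content description
style_embed = subtract_content(image_embed, content_embed)
print(style_embed)  # [0.5, 0.25, 0.0, 0.125]

# A Plus-style input is a list of per-patch vectors, so the helper rejects it.
patch_features = [[1.0, 0.25, 0.5, 0.125] for _ in range(3)]
try:
    subtract_content(patch_features, content_embed)
except ValueError as err:
    print("rejected:", err)
```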