Support for PhotoMaker - Githubissues

Acly / krita-ai-diffusion

Streamlined interface for generating images with AI in Krita. Inpaint and outpaint with optional text prompt, no tweaking required.

https://www.interstice.cloud

GNU General Public License v3.0

Support for PhotoMaker #362

Closed Danamir closed 9 months ago

Danamir commented 9 months ago

PhotoMaker is pretty straightforward to implement, and offers results somewhat better than FaceID.

ComfyUI just added some experimental support that needs no custom node in this commit. But I had better results with PhotoMaker-Plus, which is practically the same thing, but with support for multiple input images and multiple trigger words.

The workflow with ConditioningSetTimestepRange works pretty well to adjust the strength, and a String Function node was used to inject the trigger word into the prompt after the first comma.
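For readers without the workflow file, the trigger-word injection can be approximated in plain Python. This is only a sketch, not the String Function node itself: the function name `inject_trigger_word`, the `before_comma` flag and the example trigger word `img` are illustrative assumptions, and the thread does not show exactly where the node places the word relative to the comma, so both readings are covered.

```python
def inject_trigger_word(prompt: str, trigger: str, before_comma: bool = False) -> str:
    """Insert `trigger` next to the first comma of `prompt`.

    Hypothetical helper: mimics what the thread describes, not the
    actual String Function node from the workflow.
    """
    head, sep, tail = prompt.partition(",")
    if not sep:
        # No comma in the prompt: append the trigger word at the end.
        return f"{prompt.rstrip()} {trigger}".strip()
    if before_comma:
        # Alternative reading: attach the trigger to the first segment.
        return f"{head.rstrip()} {trigger},{tail}"
    if not tail.strip():
        # Prompt ends right after its first comma: avoid a dangling comma.
        return f"{head}, {trigger}"
    # Literal reading: the trigger becomes its own segment after the first comma.
    return f"{head}, {trigger},{tail}"


print(inject_trigger_word("a woman, watercolor, soft light", "img"))
print(inject_trigger_word("a woman, watercolor", "img", before_comma=True))
```

With PhotoMaker-Plus and several trigger words, the same helper could be called once per word, but that is a guess about usage rather than something shown in the workflow.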

I got correct results as long as the render target is not photorealistic, using a style weight around 0.1 (varying from 0.0 to 0.3 depending on the source) and 1 to 3 reference pictures.
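One way to picture how a style weight could translate into ConditioningSetTimestepRange settings is the sketch below. It assumes the style weight is the fraction of sampling that runs on the plain text conditioning before the PhotoMaker conditioning takes over. That mapping is an assumption and is not confirmed by the thread. `StepRange` and `split_timestep_ranges` are hypothetical names, and the returned values are simply the `start` and `end` fractions the node expects.

```python
from typing import NamedTuple


class StepRange(NamedTuple):
    start: float  # fraction of the sampling schedule, 0.0 = first step
    end: float    # fraction of the sampling schedule, 1.0 = last step


def split_timestep_ranges(style_weight: float) -> tuple[StepRange, StepRange]:
    """Return (plain_range, photomaker_range) for a given style weight.

    Assumption: the plain prompt conditioning covers the first
    `style_weight` fraction of sampling, and the PhotoMaker
    conditioning covers the remainder.
    """
    if not 0.0 <= style_weight <= 1.0:
        raise ValueError("style_weight must be between 0.0 and 1.0")
    plain = StepRange(0.0, style_weight)
    photomaker = StepRange(style_weight, 1.0)
    return plain, photomaker


plain, photomaker = split_timestep_ranges(0.1)
print(plain, photomaker)
```

Under this reading, a weight of 0.0 would hand the whole schedule to the PhotoMaker conditioning, and raising the weight would leave more of the early steps to the style prompt alone.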

Workflow:

photomaker.json

Acly commented 9 months ago

The LoRA+Conditioning combo is quite interesting and may have some advantages over model embedding, although you can have masking, strength and start/end control in either case.

I'm inclined to wait a little and see whether a clear "winner" emerges among FaceID, InstantID and PhotoMaker. Likely there isn't one: my impression is that FaceID is better at replicating faces (though some luck is required), while PhotoMaker is better at style changes. PhotoMaker has no support for SD1.5, I believe.

FaceID also has a "portrait" model (not implemented here) which takes multiple images as reference.

Danamir commented 9 months ago

Yes, there is no support for SD 1.5 right now. From my tests, PhotoMaker is very good at transferring a person's likeness to a stylized image such as a cartoon or anime. It is also really lightweight, with almost no impact on generation time or RAM usage.

Sadly, it uses its own CLIP encoding, which doesn't support the separate SDXL G & L prompts that I like to use (I made those available for the plugin on my fork).