comfyanonymous / ComfyUI

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
https://www.comfy.org/

(Question) How do I exactly replicate an image that was made in Automatic1111? #2334

Open gluttonium opened 11 months ago

gluttonium commented 11 months ago

Hi everyone, I installed ComfyUI and wanted to give it a try. I made some great images in Stable Diffusion (aka Automatic1111) and wanted to replicate them in ComfyUI. I copied all the settings (sampler, CFG scale, model, VAE, etc.), but the generated image looks different. The overall style looks about the same, but something, maybe the seed handling or the CFG scale, seems off.

Is there a way or a guide somewhere on how to exactly replicate automatic1111 images in ComfyUI?

NeedsMoar commented 11 months ago

You generally can't exactly, but you can get close. If you don't have ComfyUI-Manager yet, get it, then get ComfyUI_ADV_CLIP_emb. That gives you a node that weights your prompt the way Automatic1111 does (which is very different from ComfyUI's default) and lets you use the same syntax.

It should be in the main manager menu under the A1111 Alternatives (or something like that) button, which has a bunch of nodes designed to imitate or restore Automatic1111 behaviors either closely or exactly. I have the CLIP node installed myself, but more for Compel weighting when I feel like trying to remember the syntax or look it up again, since I've never used A1111.
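
Roughly, the weighting difference looks something like this, as I understand it (an illustrative sketch, not the actual code from either project):

```python
# Rough sketch of the two weighting schemes (illustrative only).
# z: token embeddings from the CLIP text encoder, shape (tokens, dim)
# z_empty: embeddings of an empty prompt, same shape
# w: per-token weights parsed from "(word:1.2)"-style syntax, shape (tokens,)
import torch

def a1111_style(z: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # A1111 multiplies each token embedding by its weight, then rescales the
    # whole tensor so its mean matches the unweighted embedding's mean.
    original_mean = z.mean()
    z = z * w.unsqueeze(-1)
    return z * (original_mean / z.mean())

def comfy_style(z: torch.Tensor, z_empty: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # ComfyUI's default (as far as I can tell) instead pulls each weighted
    # token toward the empty-prompt embedding rather than renormalizing,
    # so the same weight value lands differently.
    return z_empty + (z - z_empty) * w.unsqueeze(-1)
```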

NeedsMoar commented 11 months ago

Automatic also has some mechanism that artificially spreads the CFG over a higher range. Once again there's a node that'll do it, but I'd just get used to using lower CFG values than you're probably used to for the same image. You can select which floating-point mode every stage runs in from the command line, and it'll be honored by the internal nodes for the most part, so that's another thing to match up. I didn't have any issues even with e4m3 8-bit float UNet storage (the math is still done at higher precision), but that's mostly useful if you're low on VRAM right now, at least until TransformerEngine support shows up, and only if you have an Ada card.
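
For reference, the guidance step itself is the standard classifier-free guidance formula in both UIs, which is also why any small upstream difference (weighting, precision, attention backend) gets amplified at high CFG values:

```python
import torch

def cfg_denoise(cond_pred: torch.Tensor, uncond_pred: torch.Tensor, cfg: float) -> torch.Tensor:
    # Standard classifier-free guidance: push the prediction away from the
    # unconditional one. Any small mismatch between the two UIs' conditioning
    # gets multiplied by cfg here, so high CFG values drift apart faster.
    return uncond_pred + cfg * (cond_pred - uncond_pred)
```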

You'll probably also want to avoid nodes like HyperTile / FreeU / other things that speed up or enhance the model, unless Automatic had them integrated when you generated your image. This isn't necessarily easy to figure out; you might have to look through the check-ins. Where differences will arise is most likely the spots where Comfy autocasts to fp16 or up to fp32, since it turns on fp16 automatically on appropriate cards. Finally, scheduling of samplers works differently in Comfy; there's a custom node that tries to reproduce Automatic's behavior, I think, but it may be broken. The samplers themselves should be pretty much identical.
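
For what it's worth, the "Karras" schedule both UIs offer is the same formula from k-diffusion (Karras et al. 2022); if I remember right, where they can diverge is in what sigma_min / sigma_max they feed it and how the final step is handled, which is enough to change the image at low step counts. Rough sketch:

```python
import torch

def karras_sigmas(n_steps: int, sigma_min: float, sigma_max: float, rho: float = 7.0) -> torch.Tensor:
    # Karras noise schedule: interpolate between sigma_max and sigma_min in
    # 1/rho space, then raise back to the rho power. Both UIs use this shape;
    # the inputs and the appended final sigma are where differences creep in.
    ramp = torch.linspace(0, 1, n_steps)
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
    return torch.cat([sigmas, torch.zeros(1)])
```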

Also make note of whether you were running flash attention or PyTorch attention on Automatic, since those change image generation. I have a build of xformers + flash attention 2.3.6 dev for torch 2.1.2+cu121 / cp311 / Win64 in a repo if you need it, since they're not building wheels for Windows right now.
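
If you want to convince yourself the attention backend matters, something like this (assuming you have xformers installed and a CUDA card) shows the two implementations don't match bit-for-bit, and that tiny difference compounds over dozens of UNet layers and sampling steps:

```python
import torch
import torch.nn.functional as F
import xformers.ops

# xformers expects (batch, seq, heads, head_dim); PyTorch SDPA expects
# (batch, heads, seq, head_dim), hence the transposes below.
q = torch.randn(1, 77, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 77, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 77, 8, 64, device="cuda", dtype=torch.float16)

out_xformers = xformers.ops.memory_efficient_attention(q, k, v)
out_sdpa = F.scaled_dot_product_attention(
    q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
).transpose(1, 2)

# Usually a small but nonzero difference in fp16.
print((out_xformers - out_sdpa).abs().max())
```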