Mikubill / sd-webui-controlnet

WebUI extension for ControlNet
GNU General Public License v3.0

[ControlNet 1.1] The updating track. #736

Closed lllyasviel closed 1 year ago

lllyasviel commented 1 year ago

We will use this repo to track some discussions for updating to ControlNet 1.1.

lllyasviel commented 1 year ago

all problems with lowvram and medvram fixed

MadaraxUchiha88 commented 1 year ago

all problems with lowvram and medvram fixed

Now I'll be able to update it again and use Shuffle :D Does the Style model still not work? Or was that fixed as well?

lllyasviel commented 1 year ago

yes. all fixed

MadaraxUchiha88 commented 1 year ago

yes. all fixed

Man, you guys are certified badasses :) I can't thank you enough :)

Rayregula commented 1 year ago

Sorry to bother everyone. I see in the readme that 1.1 is in beta (and is linked at the top of this thread in a different repo), but that does not mean that the master branch is where the beta is, does it? I have been having issues all day (more like 8 hours; it's a slow process for me to see results, so testing was also slow) trying to get openpose to work, and I am now wondering if I am running ControlNet 1.1 but using the old models.

I checked this thread earlier, but it seemed like this thread was tracking progress until ControlNet 1.1 was pushed, and I did not see anything that sounded like it had been pushed (I did not read the messages hidden due to this being such a massive thread).

lllyasviel commented 1 year ago

Inpainting is supported!

Much more stable than a1111 inpaint because a controlnet is applied. image

image

Right now only one mode, "inpaint_global_harmonious", is supported. We will have another, "inpaint_restrict", later (hopefully).

lllyasviel commented 1 year ago

this inpainting seems as robust as SD2 inpaint and dalle, but can be directly used with all SD community models

catboxanon commented 1 year ago

but that does not mean that the master branch is where the beta is does it?

Everything has been being pushed to master in this repo, yes, lol.

If you want to revert back before all the 1.1 updates you can checkout 0f549888fd49aea48a4a5049f75c2e87ad3affad

lllyasviel commented 1 year ago

yes, after the bad example of A1111, everyone began to use the master branch to do the job

Rayregula commented 1 year ago

Everything has been being pushed to master in this repo, yes, lol.

Oh thank you, I was in the process of trying a 1.1 model as I had tried everything else I could think of.

continue-revolution commented 1 year ago

Inpainting is supported!

Much more stable than a1111 inpaint because a controlnet is applied. image

image

Right now only one mode, "inpaint_global_harmonious", is supported. We will have another, "inpaint_restrict", later (hopefully).

Great! My SAM extension will soon be connected to this mode of ControlNet (and hopefully semantic segmentation).

I’m curious about the difference between inpaint_global_harmonious and inpaint_restrict.

lllyasviel commented 1 year ago

inpaint_global_harmonious = diffusion
inpaint_restrict = masked diffusion

global_harmonious is seamless but may change the unmasked area a bit; restrict may have a seam but does not change the unmasked area.

"restrict" is very difficult to implement; hopefully we can have that later.

lllyasviel commented 1 year ago

very robust, nearly no distortion in all cases

ControlNet: image

A1111 Native: image

lllyasviel commented 1 year ago

OMG, I find that ControlNet inpaint and A1111 inpaint can work together to achieve perfectly seamless and super robust inpainting while at the same time not changing the unmasked area.

lllyasviel commented 1 year ago

like this image

image

catboxanon commented 1 year ago

Should compare with the SD inpainting model; other models aren't really designed for inpainting unless they were converted to an inpainting model (I believe this is the same formula you originally proposed for creating a "difference" controlnet model).

https://huggingface.co/runwayml/stable-diffusion-inpainting/blob/main/sd-v1-5-inpainting.ckpt

lllyasviel commented 1 year ago

that random seed problem is also fixed

lllyasviel commented 1 year ago

I think the updating of 1.1 is mostly done

catboxanon commented 1 year ago

This is with SD1.5 inpainting model for comparison, no controlnet used (couldn't find the example image you used in the 1.1 nightly repo). A lot better results than using a non-inpainting model. I'm sure the ControlNet is very beneficial to those models that aren't converted to one though. image

lllyasviel commented 1 year ago

I think all models and preprocessors are basically done. More tests needed.

lllyasviel commented 1 year ago

not sure why, but it seems that ip2p is only good at putting things on fire image image

this is "make it ice" image image

not very stable

catboxanon commented 1 year ago

For some reason I'm getting C++ errors for OpenCV at this line, I imagine it's not something you can reproduce?

https://github.com/Mikubill/sd-webui-controlnet/blob/7e08d2a87e70098f76c20836aa468a56090430d5/annotator/util.py#L33

cv2.error: OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\core\src\matrix.cpp:462: error: (-2:Unspecified error) in function '__cdecl cv::Mat::Mat(class cv::Size_<int>,int,void *,unsigned __int64)'
>  (expected: '_step >= minstep'), where
>     '_step' is 1929
> must be greater than or equal to
>     'minstep' is 329216

cv2.error: Unknown C++ exception from OpenCV code

Got both of those errors separately in different instances of trying to use the inpaint controlnet. I'll dig more into it and file a bug if you can't replicate that.

AbyszOne commented 1 year ago

Probably it can improve with controlnets, but standalone inpaint tends to make a mess with general proportions.

lllyasviel commented 1 year ago

perhaps problems with opencv version

catboxanon commented 1 year ago

Hm, seems like I only get it when I'm in the img2img tab and I use these options, particularly Only masked. Can you reproduce it if you use that?

image

catboxanon commented 1 year ago

Actually I understand how this is supposed to work now. I'm a bit used to the other way of inpainting.

The key thing here is that to use the inpaint controlnet in the img2img tab, you need to use Whole picture and keep the resolution sliders at the same aspect ratio as the input. The only issue I see here is that it prevents higher-resolution inpaints from being performed (normally you would use an inpainting model, use Only masked, and set your inpaint size to 512x512 or 768x768, and after inpainting the inpainted area would automatically be upscaled or downsampled and made to fit the original input resolution of the fullsize image).

You can sorta rectify this by enabling the options to output a composited mask in settings so it can output the inpainted part, and then you could layer that in external software over the original, fullsize image and somehow do your upscaling later.

So, all this to say that the OpenCV error is likely because the size is getting changed in the inpaint input, but not the controlnet input.

Edit: This has been fixed now! See this comment
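
As a side note on the workaround mentioned above (outputting the inpainted part and layering it over the original in external software), a minimal sketch with PIL, assuming the inpainted output and the mask were saved as files; all file names here are placeholders:

```python
# Minimal compositing sketch; file names are placeholders.
from PIL import Image

original = Image.open("original_fullsize.png").convert("RGB")
inpainted = Image.open("inpainted_output.png").convert("RGB").resize(original.size)
mask = Image.open("inpaint_mask.png").convert("L").resize(original.size)

# Keep the original everywhere except the masked (white) region,
# which is taken from the inpainted result.
composited = Image.composite(inpainted, original, mask)
composited.save("composited.png")
```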

lllyasviel commented 1 year ago

ip2p is better when used with multiple controls image

make it at night, magic lit, city, masterpiece, high-quality, extremely detailed
Negative prompt: long body, low resolution, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 309724531, Size: 512x640, Model hash: abcaf14e5a, Model: anything-v3-full, Clip skip: 2, ControlNet-0 Enabled: True, ControlNet-0 Module: none, ControlNet-0 Model: control_v11e_sd15_ip2p [c4bb465c], ControlNet-0 Weight: 1, ControlNet-0 Guidance Start: 0, ControlNet-0 Guidance End: 1, ControlNet-1 Enabled: True, ControlNet-1 Module: lineart_anime, ControlNet-1 Model: control_v11p_sd15s2_lineart_anime [3825e83e], ControlNet-1 Weight: 1, ControlNet-1 Guidance Start: 0, ControlNet-1 Guidance End: 1

image
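
For anyone scripting a two-unit setup like this instead of using the UI, a rough sketch against the webui API is below. It assumes a local webui launched with --api; the field names follow the extension's alwayson_scripts payload format, which has changed between versions, so treat them as an approximation rather than a guaranteed interface:

```python
# Rough sketch of a txt2img call with two ControlNet units via the webui API.
# Assumes the webui is running locally with --api; field names may vary by version.
import base64
import requests

def b64(path):
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

payload = {
    "prompt": "make it at night, magic lit, city, masterpiece, high-quality, extremely detailed",
    "negative_prompt": "long body, low resolution, bad anatomy, bad hands, missing fingers",
    "steps": 20,
    "width": 512,
    "height": 640,
    "alwayson_scripts": {
        "controlnet": {
            "args": [
                {   # unit 0: instruct-pix2pix control, no preprocessor
                    "input_image": b64("input.png"),
                    "module": "none",
                    "model": "control_v11e_sd15_ip2p [c4bb465c]",
                    "weight": 1.0,
                },
                {   # unit 1: anime lineart control to keep the structure
                    "input_image": b64("input.png"),
                    "module": "lineart_anime",
                    "model": "control_v11p_sd15s2_lineart_anime [3825e83e]",
                    "weight": 1.0,
                },
            ]
        }
    },
}

response = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
print(response.json().keys())
```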

lllyasviel commented 1 year ago

it seems that inpaint "only masked" needs a fix

lllyasviel commented 1 year ago

inpaint "only masked" is supported; make sure that the image resizing is the same between a1111 and controlnet

image image

lllyasviel commented 1 year ago

Test:

Image1 image

Image2 image

For content we use lineart_anime. For style we use shuffle.

content 1 + style 2 image

content 2 + style 1 image

lllyasviel commented 1 year ago

I will end here today. I think most things are done. Will not add any new things; will only fix bugs over the next several days.

lllyasviel commented 1 year ago

Tile works like this:

This is a super small image image

it is 100 * 141

then set the tile annotator resolution to 768 and set the a1111 resolution to 768×1024

(((masterpiece))), 1girl, high-quality, extremely detailed
Negative prompt: long body, low resolution, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 12345, Size: 768x1024, Model hash: abcaf14e5a, Model: anything-v3-full, ControlNet-0 Enabled: True, ControlNet-0 Module: tile_gaussian, ControlNet-0 Model: control_v11u_sd15_tile [1f041471], ControlNet-0 Weight: 1, ControlNet-0 Guidance Start: 0, ControlNet-0 Guidance End: 1

image

Note that this is 100% t2i with all diffusion steps without any ddim encoding or upscaling fix or i2i encoding

It should be possible to combine this with other i2i encoding or other upscaling. Because it only uses the controlnet pass, it does not occupy any other pass.
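
As a rough illustration of what the tile_gaussian-style guidance amounts to, a sketch is below; the interpolation and blur strength are guesses for illustration, not the extension's actual preprocessor parameters:

```python
# Illustration only: blow a tiny image up to the working resolution and blur it,
# so the model is guided by colors/structure rather than pixel detail.
import cv2

small = cv2.imread("tiny_100x141.png")                       # the 100x141 input
guide = cv2.resize(small, (768, 1024), interpolation=cv2.INTER_CUBIC)
guide = cv2.GaussianBlur(guide, (0, 0), sigmaX=3)            # blur strength is a guess
cv2.imwrite("tile_guide.png", guide)
# "tile_guide.png" then plays the role of the control image for the tile model
# while txt2img runs at 768x1024 with control_v11u_sd15_tile enabled.
```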

lllyasviel commented 1 year ago

but still, the tile model is somewhat limited and no evidence shows that it brings actual benefits, so it is still marked as a [u] model

[u] means unfinished

lllyasviel commented 1 year ago

The performance of ip2p can be significantly improved with a1111 inpaint

image

Make the hair green image

Make the hair into ice image

Make the hair red image

Make the girl bald ![image](https://user-images.githubusercontent.com/19834515/232428748-8ceec13d-014a-4795-a19b-8b2b2397f2fd.png)

lllyasviel commented 1 year ago

Tile is improved:

Input image

This is a super small image image

it is 100 * 141

set the a1111 resolution to 768×1024 and set the controlnet tile annotator resolution to 768

1girl, (((masterpiece))), high-quality, extremely detailed
Negative prompt: long body, low resolution, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality
Steps: 20, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 3927140926, Size: 768x1024, Model hash: a074b8864e, Model: Counterfeit-V2.5_pruned, ControlNet-0 Enabled: True, ControlNet-0 Module: tile_gaussian, ControlNet-0 Model: control_v11u_sd15_tile [1f041471], ControlNet-0 Weight: 1, ControlNet-0 Guidance Start: 0, ControlNet-0 Guidance End: 1

The results below are 100% full-step, one-pass t2i without any other upscaling helpers.

image

each image is 768×1024

catboxanon commented 1 year ago

Excellent work!!

Nziner commented 1 year ago

That's great! You've done an amazing job. Where can I leave a donation for you and Mikubill?

catboxanon commented 1 year ago

I'm not sure if you've tried it yet @lllyasviel, but the tile controlnet seems like it would benefit something like the included SD upscale script or Ultimate SD Upscale extension (which is a modification of the former). Both of these scale in tiles to save VRAM but quickly start to break down if the denoise is too high because the prompt doesn't match the contents, which is exactly what you propose the tile controlnet can solve. https://github.com/lllyasviel/ControlNet-v1-1-nightly#controlnet-11-tile-unfinished

Right now ControlNet feeds the entire image for this and not a matching crop of each tile when running with the script. I haven't yet looked into how it could be possible to integrate it with such a script but maybe it's something you'd want to poke around with.
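
To illustrate what "a matching crop of each tile" would mean, a minimal sketch is below; it is purely illustrative and not how the SD upscale script or this extension are actually wired together:

```python
# Illustrative only: yield the same tiles an upscale script would process,
# so each crop could be fed to the tile ControlNet instead of the whole picture.
from PIL import Image

def tile_crops(image_path, tile=512, overlap=64):
    img = Image.open(image_path)
    stride = tile - overlap
    for top in range(0, img.height, stride):
        for left in range(0, img.width, stride):
            box = (left, top, min(left + tile, img.width), min(top + tile, img.height))
            yield box, img.crop(box)
```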

AbyszOne commented 1 year ago

The second tile example looks less coherent/detailed (cloth, hair, back) and preserves fewer of the original colors than the first example (barely any green, while the background is green).

lllyasviel commented 1 year ago

Both tile and inpainting need a better way to post-process each diffusion iteration, but it seems difficult to implement.

IndolentKheper commented 1 year ago

A weird bug with the lineart_anime annotator causes random black spots on some images: lineartanimeblackspots With the regular lineart preprocessor there are no problems with the input sketch. However, with the lineart_anime preprocessor, random black spots appear on the image which affect the output (the black spots have their output improvised since the lineart is missing). Changing the input image by upscaling, denoising, etc., strangely doesn't help the problem much. It doesn't happen to every image, but I've been able to recreate the problem with a lot of different images.

Another problem I've been seeing, though this is something Stable Diffusion generally does to mess up faces, is the lineart on faces goes straight to hell when it's not a close-up of a single character. derphorror If a single character from the same sketch input is isolated and brought in for a close-up, the lineart for the face stays really close to the original sketch. But with the same sketch zoomed out, it only makes derp-horror faces. I've been able to recreate this reliably with full-body inputs which always result in messed-up face lineart, but work perfectly if the same image is used from the waist up.

lllyasviel commented 1 year ago

lineart means extracting lineart from illustrations. If your image is already lineart, your preprocessor should be "none", and set "invert input color".
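
"Invert input color" effectively just flips the image so the lines become white on black, which is the form the lineart models expect. A tiny sketch of the equivalent manual step (file names are placeholders):

```python
# Manual equivalent of "invert input color"; file names are placeholders.
import cv2

sketch = cv2.imread("sketch_black_on_white.png", cv2.IMREAD_GRAYSCALE)
control = 255 - sketch  # white lines on black; preprocessor then set to "none"
cv2.imwrite("control_input.png", control)
```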

alelordelo commented 1 year ago

@lllyasviel , I trained a new ControlNet following your circles tutorial: https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md

Then I tried to extract the generated checkpoint with @Mikubill's extract_controlnet.py script, but I got the error below. I am using a checkpoint from the lightning_logs folder (from epoch 1). Is this correct?

/content/drive/MyDrive/stable-diffusion-webui-colab/stable-diffusion-webui/extensions/sd-webui-controlnet/extract_controlnet.py:25
   22     state_dict = {k.replace("control_model.", ""): v.to(dtype) for ...
   23
   24     if args.dst.endswith(".safetensors"):
 ❱ 25         save_file(state_dict, args.dst)
   26     else:
   27         torch.save({"state_dict": state_dict}, args.dst)

/usr/local/lib/python3.9/dist-packages/safetensors/torch.py:71 in save_file
 ❱ 71     serialize_file(_flatten(tensors), filename, metadata=metadata)

/usr/local/lib/python3.9/dist-packages/safetensors/torch.py:221 in _flatten
   218     ptrs = defaultdict(set)
   219     for k, v in tensors.items():
   220         if not isinstance(v, torch.Tensor):
 ❱ 221             raise ValueError(f"Key {k} is invalid, expected torch.Te...

ValueError: Key `epoch` is invalid, expected torch.Tensor but received <class 'int'>
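
Judging from the traceback, a likely cause is that a raw PyTorch Lightning checkpoint stores metadata such as epoch and global_step next to the weights, so safetensors' save_file hits a non-tensor value. A hedged sketch of unwrapping the nested state_dict before extraction (paths are placeholders, and this is not the official script):

```python
# Sketch: unwrap a Lightning checkpoint and keep only control_model.* tensors.
# Paths are placeholders.
import torch
from safetensors.torch import save_file

ckpt = torch.load("lightning_logs/version_0/checkpoints/epoch=1.ckpt", map_location="cpu")
sd = ckpt.get("state_dict", ckpt)   # drop Lightning metadata ('epoch', 'global_step', ...)

control_sd = {
    k.replace("control_model.", ""): v.half()
    for k, v in sd.items()
    if k.startswith("control_model.") and isinstance(v, torch.Tensor)
}
save_file(control_sd, "extracted_controlnet.safetensors")
```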

IndolentKheper commented 1 year ago

A lot of pencil sketches on paper don't qualify as proper lineart though, so they often benefit from using a preprocessor to remove artifacts and shadows that appear when the paper is scanned or photographed.

lllyasviel commented 1 year ago

cn lineart 1.1 is trained to handle noisy linearts

Echolink50 commented 1 year ago

Not sure if this is the right place but do inpaint and tile have a certain pre-processor to use?

Rayregula commented 1 year ago

Not sure if this is the right place but do inpaint and tile have a certain pre-processor to use?

Not sure about tile.. but for inpaint I would expect it to be "inpaint_global_harmonious"?

bropines commented 1 year ago

@lllyasviel Is it possible to somehow improve the ControlNet using T2I adapters or CoAdapter from Tencent? Here is the link: https://github.com/TencentARC/T2I-Adapter

Echolink50 commented 1 year ago

Not sure if this is the right place but do inpaint and tile have a certain pre-processor to use?

Not sure about tile.. but for inpaint I would expect it to be "inpaint_global_harmonious"?

Ahhhh thanks. I didn't even notice inpaint_global_harmonious. Wonder about tile though.

2blackbar commented 1 year ago

I don't think the tile model does anything; same results with or without it here.