AIGText / GlyphControl-release

[NeurIPS2023] This is the official code of the paper "GlyphControl: Glyph Conditional Control for Visual Text Generation"
MIT License

Is there a way to use this with multicontrolnet in A1111 #2

Closed ghpkishore closed 1 year ago

ghpkishore commented 1 year ago

I wanted to know whether there is a way to use your ControlNet model with SD v1.5 or models based on a similar architecture, adding the glyph ControlNet on top of them (similar to how Multi-ControlNet is being used right now). Please let us know whether it is possible and, if so, how one can go about implementing it.

Thanks

PkuRainBow commented 1 year ago

@yukang123 @guidongnan Please help to answer the question.

yukang123 commented 1 year ago

Hi @ghpkishore, thanks for your questions!

Since our GlyphControl framework is mainly based on ControlNet, it is possible to transfer our ControlNet model onto other Stable Diffusion models. To do so, it is necessary to extract the model parameters of the ControlNet part (i.e., the glyph ControlNet in our scenario) from the released checkpoints, as ControlNet v1.1 does. You could then use these parameters as one type of control on other SD models. (P.S. For the checkpoints trained on TextCaps 5K, we also fine-tuned the U-Net decoder of the original SD model, so it may not be suitable to transfer these checkpoints onto other models as you suggested.)

Although it is doable, I cannot guarantee the outcomes. Since our model was trained on SD v2.0, results may differ when it is used with other SD models. Besides, the effects of combining our glyph ControlNet with other ControlNet models that use canny maps or segmentation maps are still worth exploring.

ghpkishore commented 1 year ago

Hi @yukang123

Thanks for your response, I appreciate it. You mentioned that it is possible to extract the ControlNet part of the model. Can you help me with the script I need for this? Also, I saw that the checkpoints on HF are roughly 6.67 GB and wanted to know how big the extracted ControlNet would be.

I agree that it is trained on v2 and hence might not work well with other models, but I am willing to try. There are ControlNets trained with v2, and it would be fun to mix and match them to see how this works in a multi-control setting.

My goal is to get a .pth file similar to the ControlNet v1.1 models (https://huggingface.co/lllyasviel/ControlNet-v1-1/tree/main) so that I can run it directly with the existing ControlNet webui from A1111. This is the note from the ControlNet repo owner on using custom models:

Note: If you download models elsewhere, please make sure that yaml file names and model files names are same. Please manually rename all yaml files if you download from other sources. (Some models like "shuffle" needs the yaml file so that we know the outputs of ControlNet should pass a global average pooling before injecting to SD U-Nets.)

Please let me know how to proceed. Thanks

yukang123 commented 1 year ago

The basic approach is to load the entire model and save only the parts related to the ControlNet. You can refer to inference.py and cldm/cldm.py to see how the model is loaded, and then try to save the model parameters related to ControlNet (probably self.control_model in cldm.py).
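A minimal sketch of that idea follows. It is not the repository's own script, and it rests on assumptions to verify against your checkpoint: that the weights sit under a "state_dict" entry, and that the ControlNet parameters use the key prefix "control_model." (inferred from self.control_model). The helper names extract_controlnet and save_controlnet are hypothetical.

```python
from typing import Any, Dict, Mapping

# Assumption: ControlNet weights use this key prefix, inferred from the
# `self.control_model` attribute. Inspect your checkpoint's keys to confirm.
CONTROL_PREFIX = "control_model."


def extract_controlnet(state_dict: Mapping[str, Any],
                       prefix: str = CONTROL_PREFIX,
                       strip_prefix: bool = False) -> Dict[str, Any]:
    """Keep only the entries whose key starts with `prefix`."""
    extracted = {}
    for key, value in state_dict.items():
        if key.startswith(prefix):
            new_key = key[len(prefix):] if strip_prefix else key
            extracted[new_key] = value
    return extracted


def save_controlnet(src_path: str, dst_path: str) -> int:
    """Load a full checkpoint, save the ControlNet subset, return the tensor count."""
    import torch  # imported here so extract_controlnet works without PyTorch

    checkpoint = torch.load(src_path, map_location="cpu")
    # Assumption: Lightning-style checkpoints nest the weights under "state_dict";
    # otherwise the loaded object is treated as the state dict itself.
    state_dict = checkpoint.get("state_dict", checkpoint)
    control = extract_controlnet(state_dict)
    if not control:
        raise ValueError(f"no keys starting with {CONTROL_PREFIX!r} in {src_path}")
    torch.save(control, dst_path)
    return len(control)
```

Whether the target tool expects the prefix to be kept or stripped is not settled in this thread, which is why strip_prefix is left as an option.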

The whole procedure is simple to implement. You could try extracting the parameters and comparing them with the released ControlNet v1.1 models.
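One way to do that comparison is to diff the parameter names and tensor shapes of the two state dicts. The sketch below is only an illustration: compare_state_dicts is a hypothetical helper, and it assumes both inputs are plain key-to-tensor mappings that are already loaded. Since the model was trained on SD v2.0, some shape differences against the v1.1 files may be expected rather than a sign of a bad extraction.

```python
from typing import Any, Dict, List, Mapping, Optional, Tuple


def _shape_of(value: Any) -> Optional[Tuple[int, ...]]:
    """Return the value's shape as a tuple, or None if it has no shape."""
    shape = getattr(value, "shape", None)
    return tuple(shape) if shape is not None else None


def compare_state_dicts(extracted: Mapping[str, Any],
                        reference: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Report keys missing from, unexpected in, or shaped differently in `extracted`."""
    extracted_keys = set(extracted)
    reference_keys = set(reference)
    shared = extracted_keys & reference_keys
    return {
        # present in the reference model but absent from the extracted one
        "missing": sorted(reference_keys - extracted_keys),
        # present in the extracted model but absent from the reference one
        "unexpected": sorted(extracted_keys - reference_keys),
        # same name in both, but different tensor shapes
        "shape_mismatch": sorted(
            k for k in shared if _shape_of(extracted[k]) != _shape_of(reference[k])
        ),
    }
```

If all three lists come back empty, the extracted file has the same layout as the reference; non-empty lists show where the two architectures diverge.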

ghpkishore commented 1 year ago

@yukang123 thanks. There is this script: https://github.com/Mikubill/sd-webui-controlnet/blob/main/extract_controlnet.py

Would this be able to extract the ControlNet? If you think it could work, I will try it. This is the script currently used to extract the ControlNet from v1.5 models.

yukang123 commented 1 year ago

I think it would be very helpful for extraction. But I still suggest reading the code first in case errors come up during extraction.

In general, the script you mentioned should be fine.