Closed George0726 closed 8 months ago
I looked into this a while ago. I did not find a good approach to running two UNets side by side in A1111, the way the original diffusers implementation provided by the paper's authors does.
However, if generation speed is not a requirement, X-Adapter can be implemented using A1111's refiner mechanism. In the first pass, we run an SD1.5 generation and store the hidden states. In the second pass, when we generate with SDXL, we convert these hidden states to the SDXL format with X-Adapter and inject them.
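The two-pass flow described above can be sketched roughly as follows. This is only an illustration of the store, convert, and inject order: `HiddenStateStore`, `first_pass_sd15`, `x_adapter_convert`, and `second_pass_sdxl` are hypothetical names, not A1111 or X-Adapter APIs, and plain lists stand in for real tensors.

```python
from dataclasses import dataclass, field


@dataclass
class HiddenStateStore:
    """Hidden states captured during the SD1.5 pass, keyed by (step, block)."""
    states: dict = field(default_factory=dict)

    def save(self, step, block, value):
        self.states[(step, block)] = value

    def load(self, step, block):
        return self.states.get((step, block))


def first_pass_sd15(store, steps, blocks):
    """Pass 1: run SD1.5 and record one hidden state per step and block."""
    for step in range(steps):
        for block in blocks:
            # Stand-in values; a real pass would capture UNet features here.
            store.save(step, block, [float(step), float(len(block))])


def x_adapter_convert(hidden):
    """Placeholder for the adapter's mapping into the SDXL feature space.

    Assumption: the real adapter uses trained layers; scaling by 2 only
    marks where that conversion would happen.
    """
    return [2.0 * v for v in hidden]


def second_pass_sdxl(store, steps, blocks):
    """Pass 2: run SDXL and inject converted states wherever one was stored."""
    injected = []
    for step in range(steps):
        for block in blocks:
            hidden = store.load(step, block)
            if hidden is None:
                continue  # nothing recorded for this step/block
            injected.append((step, block, x_adapter_convert(hidden)))
    return injected
```

Because the store sits between the two passes, the two models never need to be resident at the same time, which is the trade of speed for simplicity mentioned above.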
That said, X-Adapter should probably be made into its own extension.
What needs to change in this repo is probably the mechanism we use to filter ControlNet models. Currently, ControlNet models are filtered based on the active SD model's version.
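A minimal sketch of what a relaxed filter could look like, assuming each ControlNet model carries a version tag. `ControlModel`, `filter_control_models`, and the `x_adapter_enabled` flag are hypothetical and do not reflect this repo's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlModel:
    name: str
    sd_version: str  # assumed tags: "sd15" or "sdxl"


def filter_control_models(models, active_version, x_adapter_enabled=False):
    """Return the ControlNet models selectable for the active checkpoint.

    Default behaviour mirrors the current rule: only models matching the
    active SD version pass. With the hypothetical x_adapter_enabled flag,
    SD1.5 models are also allowed while an SDXL checkpoint is active.
    """
    allowed = {active_version}
    if x_adapter_enabled and active_version == "sdxl":
        allowed.add("sd15")
    return [m for m in models if m.sd_version in allowed]
```

For example, with one SD1.5 and one SDXL canny model installed, the default call under an SDXL checkpoint returns only the SDXL model, while passing `x_adapter_enabled=True` returns both.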
According to my testing in https://github.com/showlab/X-Adapter/issues/25, X-Adapter can be thought of as an improved hires fix, with the low-res pass using an SD1.5 checkpoint and the high-res pass using an SDXL model. The `adapter_guidance_start` parameter decides how much noise is added to the first-pass result and how many steps are run in the second pass.
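To make the noise/steps trade-off concrete, here is a rough sketch modeled on how img2img-style denoising strength usually works. `second_pass_plan` and `noise_fraction` are hypothetical; how `adapter_guidance_start` maps onto `noise_fraction` is not stated in this thread, so treat the mapping as an open assumption.

```python
def second_pass_plan(total_steps, noise_fraction):
    """Split a sampling schedule for a hires-fix-style second pass.

    noise_fraction (assumed, 0.0 to 1.0) is how far the first-pass result
    is pushed back toward noise. More noise means more of the schedule
    has to be re-run by the second model.

    Returns (skipped_steps, steps_to_run).
    """
    if not 0.0 <= noise_fraction <= 1.0:
        raise ValueError("noise_fraction must be within [0, 1]")
    steps_to_run = min(total_steps, round(total_steps * noise_fraction))
    return total_steps - steps_to_run, steps_to_run
```

Under this model, `noise_fraction=1.0` skips nothing and amounts to running the second pass from pure noise, while `noise_fraction=0.0` leaves the first-pass result untouched.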
Using the second pass alone (starting from pure random noise) won't give a satisfying result. This approach is also highly dependent on the SD1.5 model used: if a feature cannot be correctly interpreted by the SD1.5 model, the SDXL model likely won't modify it much.
Overall, I do not think it will be worth the effort to implement X-Adapter.
X-Adapter has now released its code and model. X-Adapter makes ControlNet models for SD1.5 compatible with SDXL. It would be a great improvement for the diffusion community! https://github.com/showlab/X-Adapter