Questions about training/inference implementation details?

rohitgandikota / sliders

Concept Sliders for Precise Control of Diffusion Models

MIT License


1. requirements.txt pins diffusers to 0.20.2, but the __call__ method in https://github.com/rohitgandikota/sliders/blob/main/eval-scripts/generate_images_xl.py#L39 appears to be adapted from diffusers 0.21.0 or later. The original diffusers implementation is here: https://github.com/huggingface/diffusers/blob/v0.21.0/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L544. What is the rationale behind this?
2. The LoRA implementation appears to be adapted from https://github.com/p1atdev/LECO/blob/main/lora.py, and it uses the 'c3lier' type of LoRA. What is the reason for this?
3. Most of the results are based on SDXL. How does it perform on SD1, and do you have any pretrained models for it?
4. The SDEdit technique was discussed in #2.
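
For anyone who wants to confirm which diffusers version is actually in use before comparing the two __call__ implementations, here is a minimal sketch using only the Python standard library. The helper names (parse_version, installed_version, matches_pin) are hypothetical and not part of the sliders repo or of diffusers. The pinned value 0.20.2 is taken from the question above.

```python
import re
from importlib import metadata

# Version that requirements.txt is reported to pin (taken from the question above).
PINNED = "0.20.2"


def parse_version(text):
    """Turn '0.21.0' or '0.21.0.dev0' into a tuple of leading integers, e.g. (0, 21, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package="diffusers"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED):
    """True when the major.minor.patch components of both versions are equal."""
    return parse_version(installed)[:3] == parse_version(pinned)[:3]


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("diffusers is not installed in this environment")
    elif matches_pin(found):
        print(f"diffusers {found} matches the pinned version {PINNED}")
    else:
        print(f"diffusers {found} differs from the pinned version {PINNED}")
```

This only reports a mismatch. It does not say whether the modified __call__ behaves identically across versions.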

1. Huh, we are using diffusers 0.20.2 locally. Maybe the two versions are backwards compatible? It seems to work with 0.20.2 as well.
2. Yes, this work is from the authors of LECO and Erasing Concepts (the basis of LECO). It is a technical writeup of LECO that covers more practical issues. We also added image-based sliders and GAN-based sliders. To train such sliders, we had to use non-cross-attention layers (more visual layers), so we use c3lier to train LoRA on convolution layers and other ResNet layers. The sliders we train are not based on cross-attention.
3. It performs quite well on SDv1.x too. In fact, our large-scale studies were done on SDv1.4. We can release some pretrained models for them.
4. Yes
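
To illustrate what training LoRA on convolution layers can look like, here is a minimal PyTorch sketch. It is not the code from the sliders or LECO repositories. The class name LoRAConv2d and its parameters (rank, alpha, multiplier) are assumptions chosen for illustration, and the sketch assumes a base convolution with default dilation and groups. The multiplier stands in for the slider strength applied at inference.

```python
import math

import torch
import torch.nn as nn


class LoRAConv2d(nn.Module):
    """Hypothetical wrapper: a frozen Conv2d plus a trainable low-rank branch."""

    def __init__(self, base, rank=4, alpha=1.0, multiplier=1.0):
        super().__init__()
        self.base = base
        # Freeze the original convolution; only the low-rank branch is trained.
        for param in self.base.parameters():
            param.requires_grad_(False)

        # Down-projection reuses the base kernel size, stride and padding so the
        # spatial output size matches; the up-projection is a 1x1 convolution.
        self.lora_down = nn.Conv2d(
            base.in_channels, rank, base.kernel_size,
            stride=base.stride, padding=base.padding, bias=False,
        )
        self.lora_up = nn.Conv2d(rank, base.out_channels, kernel_size=1, bias=False)

        # Zero-initialising the up-projection makes the wrapper start out
        # identical to the base layer.
        nn.init.kaiming_uniform_(self.lora_down.weight, a=math.sqrt(5))
        nn.init.zeros_(self.lora_up.weight)

        self.scale = alpha / rank
        self.multiplier = multiplier  # stand-in for the slider strength

    def forward(self, x):
        delta = self.lora_up(self.lora_down(x))
        return self.base(x) + delta * self.multiplier * self.scale


if __name__ == "__main__":
    conv = nn.Conv2d(8, 16, kernel_size=3, padding=1)
    layer = LoRAConv2d(conv, rank=4)
    out = layer(torch.randn(2, 8, 32, 32))
    print(out.shape)
```

Changing multiplier after training scales the learned offset up or down without touching the base weights, which is the general idea behind a slider.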
