Closed by FSSRepo 9 months ago
Nice to know we're thinking alike! I just added support for the Latent Consistency Models sampler, as mentioned here: https://github.com/leejet/stable-diffusion.cpp/discussions/76#discussioncomment-7600422. Currently, I'm working on implementing LoRA support, similar to sd-webui. Once that's done, LCM-LoRA will also be available. It's almost complete, and I expect to finish it in the next day or two.
I just wanted to talk to you about changing the project structure to fully support the gguf format and adopting the structure used by other projects built on ggml, like llama.cpp and whisper.cpp. This should allow having a general `sd_context` struct, plus `unet_model`, `autoencoder_kl`, `clip_text_model`, `sd_model_loader`, and `sd_sample`, to decouple each component.
```c
// general full loader
sd_load_model(sd_context* ctx, const char* file_name);
// for custom unets
sd_load_unet(sd_context* ctx, ...
// custom vae
sd_load_vae(sd_context* ctx, ...
sd_load_lora(sd_context* ctx,
    sd_model_loader* loader, const char* file_name, float strength);
```
I have also conducted research on LoRA that could be helpful to you for both LoRA and SDXL support. Please review the code of my pull request.
@FSSRepo I've replied within your pull request. Sorry for the delayed response, I've been busy with work recently.
> LCM-LoRA will also be available. It's almost complete, and I expect to finish it in the next day or two.
that is some big news :)
I just added LoRA support, now we can specify the use of LCM-LoRA through the prompt. Any SD1.x model should be adaptable. https://github.com/leejet/stable-diffusion.cpp#with-lora
Here's a simple example:

```sh
./bin/sd -m ../models/v1-5-pruned-emaonly-ggml-model-f16.bin -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1
```
> I just added LoRA support, now we can specify the use of LCM-LoRA through the prompt. Any SD1.x model should be adaptable. https://github.com/leejet/stable-diffusion.cpp#with-lora
I will adapt your code for the PR, just the loader. I still need to offload the encode phase of the VAE to the GPU, although that requires me to add a function to ggml that pads the data in a tensor, and to create its kernels, because ggml_map_custom cannot interact directly with the data when the CUDA backend is used.
I will adapt your code for the PR.
Since you opened the PR, there have been several updates in my master branch. I'd appreciate it if you could rebase your branch onto my master to integrate these changes, rather than straightforwardly copying the code. This will streamline conflict resolution when merging your PR.
No problem!
@leejet Why use these asserts? `sizeof(nb[0])` measures the size of the type of `nb[0]`, which is `size_t` (8 bytes on 64-bit), so it is never equal to `sizeof(float)`.
```c
static void asymmetric_pad(struct ggml_tensor* dst,
                           const struct ggml_tensor* a,
                           const struct ggml_tensor* b,
                           int ith,
                           int nth,
                           void* userdata) {
    assert(sizeof(dst->nb[0]) == sizeof(float));
    assert(sizeof(a->nb[0]) == sizeof(float));
    assert(sizeof(b->nb[0]) == sizeof(float));
```

Fix:

```c
    assert(dst->nb[0] == sizeof(float));
    assert(a->nb[0] == sizeof(float));
    assert(b->nb[0] == sizeof(float));
```
I can't take the latest master changes, because forcing the rebase would overwrite the ggml submodule, replacing mine with yours.
Closed by https://github.com/leejet/stable-diffusion.cpp/pull/75 (fully tested LoRA loader)
In the last week, there has been a lot of talk about a new type of model, Latent Consistency Models (LCM), which significantly improves the performance of stable diffusion, generating images in far fewer steps.
It apparently works with a LoRA adapter that can be applied to any existing model. I'm not sure whether any specific changes to the UNet architecture are needed, but what does need to be done is adding a new sampler, the LCM solver.
After completing the CUDA acceleration support, which is almost finished, I will see if I can work on adding LoRA support. This will require a complete change in the current project structure. Following that, I'll add the new solver and conduct the necessary tests.