leejet / stable-diffusion.cpp

Stable Diffusion and Flux in pure C/C++
MIT License

Latent Consistency Models support would be great in this project #80

Closed: FSSRepo closed this issue 9 months ago

FSSRepo commented 9 months ago

In the last week, there has been a lot of talk about a new type of model, Latent Consistency Models, which significantly improves the performance of stable diffusion by generating images in far fewer steps.

It apparently works with a LoRA adapter that can be applied to any existing model. I'm not sure whether any specific changes to the UNet architecture are needed, but what does need to be done is adding a new sampler: an LCM solver.

After completing the CUDA acceleration support, which is almost finished, I will see if I can work on adding LoRA support. This will require a complete change in the current project structure. Following that, I'll add the new solver and conduct the necessary tests.

leejet commented 9 months ago

Nice to know we're thinking alike! I just added support for the Latent Consistency Models Sampler, as mentioned here: https://github.com/leejet/stable-diffusion.cpp/discussions/76#discussioncomment-7600422. Currently, I'm working on implementing support for lora, similar to sd-webui. Once that's done, lcm-lora will also be available. It's almost complete, and I expect to finish it in the next day or two.

FSSRepo commented 9 months ago

> Nice to know we're thinking alike! I just added support for the Latent Consistency Models Sampler, as mentioned here: https://github.com/leejet/stable-diffusion.cpp/discussions/76#discussioncomment-7600422. Currently, I'm working on implementing support for lora, similar to sd-webui. Once that's done, lcm-lora will also be available. It's almost complete, and I expect to finish it in the next day or two.

I just wanted to talk to you about changing the project structure to fully support the gguf format and adopting the structure used by other projects built on ggml, like llama.cpp and whisper.cpp. This would allow having a general struct sd_context, with unet_model, autoencoder_kl, clip_text_model, sd_model_loader, and sd_sample as decoupled components.

// general full loader
sd_load_model(sd_context* ctx, const char* file_name);

// for custom unets
sd_load_unet(sd_context* ctx, ...);

// custom vae
sd_load_vae(sd_context* ctx, ...);

sd_load_lora(sd_context* ctx, sd_model_loader* loader,
             const char* file_name, float strength);

I have also conducted research on LoRA that could be helpful to you for both LoRA and SDXL support. Please review the code of my pull request.

leejet commented 9 months ago

@FSSRepo I've replied within your pull request. Sorry for the delayed response, I've been busy with work recently.

Green-Sky commented 9 months ago

> lcm-lora will also be available. It's almost complete, and I expect to finish it in the next day or two.

that is some big news :)

leejet commented 9 months ago

I just added LoRA support; we can now specify the use of LCM-LoRA through the prompt. Any SD1.x model should be adaptable. https://github.com/leejet/stable-diffusion.cpp#with-lora

Here's a simple example:

./bin/sd -m ../models/v1-5-pruned-emaonly-ggml-model-f16.bin -p "a lovely cat<lora:lcm-lora-sdv1-5:1>" --steps 4 --lora-model-dir ../models -v --cfg-scale 1

(generated image: result of the 4-step LCM-LoRA run above)

FSSRepo commented 9 months ago

> I just added LoRA support, now we can specify the use of LCM-LoRA through the prompt. Any SD1.x model should be adaptable. https://github.com/leejet/stable-diffusion.cpp#with-lora

I will adapt your code for the PR, just the loader. I still need to offload the VAE encoder phase to the GPU, although that requires adding a function to ggml that pads the data in a tensor, along with its kernels, because ggml_map_custom cannot interact directly with the data when the CUDA backend is used.

leejet commented 9 months ago

> I will adapt your code for the PR.

Since you opened the PR, there have been several updates in my master branch. I'd appreciate it if you could rebase your branch onto my master to integrate these changes, rather than straightforwardly copying the code. This will streamline conflict resolution when merging your PR.

FSSRepo commented 9 months ago

> I will adapt your code for the PR.
>
> Since you opened the PR, there have been several updates in my master branch. I'd appreciate it if you could rebase your branch onto my master to integrate these changes, rather than straightforwardly copying the code. This will streamline conflict resolution when merging your PR.

No problem!

FSSRepo commented 9 months ago

@leejet Why use these asserts? sizeof(dst->nb[0]) measures the size of the *type* of nb[0], which is size_t (8 bytes), so it can never equal sizeof(float) (4 bytes). It's the value of nb[0] that should be compared.

static void asymmetric_pad(struct ggml_tensor* dst,
                           const struct ggml_tensor* a,
                           const struct ggml_tensor* b,
                           int ith,
                           int nth,
                           void* userdata) {
    assert(sizeof(dst->nb[0]) == sizeof(float));
    assert(sizeof(a->nb[0]) == sizeof(float));
    assert(sizeof(b->nb[0]) == sizeof(float));

Fix:

assert(dst->nb[0] == sizeof(float));
assert(a->nb[0] == sizeof(float));
assert(b->nb[0] == sizeof(float));

I can't take the latest master changes, because forcing the rebase would switch the ggml submodule from mine to yours.

FSSRepo commented 9 months ago

Closed by https://github.com/leejet/stable-diffusion.cpp/pull/75 (fully tested LoRA loader)