seruva19 / kubin

Web-GUI for Kandinsky text-to-image diffusion models.

Kandinsky 3 model is out #161

Closed: xyzzart closed this issue 9 months ago

xyzzart commented 1 year ago

Any updates on the new Kandinsky 3 model? Thanks in advance.

Blucknote commented 1 year ago

Yes, the repo has a separate branch for that: https://github.com/seruva19/kubin/tree/kandinsky-3. But you probably don't want it anyway, because the text encoder model is huge and requires A LOT of VRAM (approximately 40+ GB).

Blucknote commented 1 year ago

Update: they pruned the text encoder to FP16 and split it into several files of ~5 GB each: https://huggingface.co/kandinsky-community/kandinsky-3/tree/main/text_encoder. Now maybe we have a chance.
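
A minimal loading sketch (not from the thread), assuming the diffusers-style repo layout with a T5-family encoder under `text_encoder`; transformers resolves the sharded checkpoint files automatically:

```python
import torch
from transformers import T5EncoderModel

# Loads the pruned FP16 weights; the ~5 GB shards are resolved via the
# checkpoint index file, so no manual file handling is needed.
text_encoder = T5EncoderModel.from_pretrained(
    "kandinsky-community/kandinsky-3",
    subfolder="text_encoder",
    torch_dtype=torch.float16,
)
```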

seruva19 commented 1 year ago

I am researching methods to enable running K3 on mid-range consumer GPUs (16 GB and lower), specifically through 4-bit quantization of the encoder, but have not yet achieved success. But there are a lot of people in the field who are smarter than me, so I'm confident this issue will be solved, by me or someone else. πŸ˜‰
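
A hypothetical sketch of what 4-bit quantization of the encoder could look like with bitsandbytes via transformers; this illustrates the general approach being described, not code from the kubin repo:

```python
import torch
from transformers import BitsAndBytesConfig, T5EncoderModel

# NF4 4-bit quantization roughly quarters the encoder's VRAM footprint
# compared to FP16 (assumption: the quality hit on embeddings is acceptable).
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

text_encoder = T5EncoderModel.from_pretrained(
    "kandinsky-community/kandinsky-3",
    subfolder="text_encoder",
    quantization_config=quant_config,
    device_map="auto",
)
```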

kigy1 commented 1 year ago

Error: no file named `diffusion_pytorch_model.bin` found in directory `models\models--kandinsky-community--kandinsky-3\snapshots\0e855b07c78f7284752814a95ccb55d27392b866\movq`.

Kandinsky 3 does not work.
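
This kind of error usually points to an incomplete or interrupted download of the model snapshot. A possible fix (my assumption, not confirmed in the thread) is to force a re-download of the missing `movq` files:

```python
from huggingface_hub import snapshot_download

# Re-fetch only the MoVQ decoder files into the local cache.
snapshot_download(
    "kandinsky-community/kandinsky-3",
    allow_patterns=["movq/*"],
    force_download=True,
)
```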

seruva19 commented 11 months ago

I was able to run inference on Paperspace's RTX5000 machines (30 GB RAM, 16 GB GPU) using the 'Enable sequential CPU offload' option. It was... somewhat slow, but it worked.

![Screenshot 2023-12-10 172258](https://github.com/seruva19/kubin/assets/26826215/f7988a34-46a4-4909-9c8f-f1ca61ab4508) ![Screenshot 2023-12-10 172145](https://github.com/seruva19/kubin/assets/26826215/794ffed9-4741-4a5e-b444-1b31fa2b7e57)

So I merged the K3 inference code into the main branch, even though I only managed to get txt2img working; I'm a little bit tired of all this stuff :| From now on I'm going to focus on some of my other projects, but I might return to 'Kubin' later.
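
For reference, a rough sketch of what the 'Enable sequential CPU offload' option corresponds to at the diffusers level (kubin's own wiring may differ):

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",
    variant="fp16",
    torch_dtype=torch.float16,
)
# Each submodule is moved to the GPU only while it runs, trading speed
# for a much smaller peak VRAM footprint (hence the slow inference).
pipe.enable_sequential_cpu_offload()

image = pipe("a portrait of a cyberpunk cat", num_inference_steps=25).images[0]
image.save("out.png")
```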

kigy1 commented 11 months ago

> I was able to run inference on Paperspace's RTX5000 machines (30 GB RAM, 16 GB GPU) using the 'Enable sequential CPU offload' option. It was... somewhat slow, but it worked. So I merged the K3 inference code into the main branch, even though I only managed to get txt2img working; I'm a little bit tired of all this stuff :| From now on I'm going to focus on some of my other projects, but I might return to 'Kubin' later.

Good news πŸ‘, I will try it.

klossm commented 10 months ago

Here's a link to an article about optimisation of PixArt-alpha; not sure if it's informative for Kandinsky 3. It's an optimisation script that reduces RAM and VRAM usage to a great extent!

https://www.felixsanz.dev/articles/pixart-a-with-less-than-8gb-vram
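
The core trick in that article is to never keep the text encoder and the diffusion model in VRAM at the same time. A condensed sketch of how the same idea might transfer to Kandinsky 3 (my untested adaptation, not from the article or the repo):

```python
import gc
import torch
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained(
    "kandinsky-community/kandinsky-3", subfolder="tokenizer"
)
text_encoder = T5EncoderModel.from_pretrained(
    "kandinsky-community/kandinsky-3",
    subfolder="text_encoder",
    torch_dtype=torch.float16,
).to("cuda")

# Step 1: compute prompt embeddings while only the encoder occupies VRAM.
tokens = tokenizer("a portrait of a cyberpunk cat", return_tensors="pt").to("cuda")
with torch.no_grad():
    prompt_embeds = text_encoder(**tokens).last_hidden_state

# Step 2: free the encoder completely before loading the UNet/MoVQ.
del text_encoder
gc.collect()
torch.cuda.empty_cache()
```

The precomputed embeddings can then be handed to the denoising stage (e.g. as `prompt_embeds` in the diffusers Kandinsky3Pipeline), which is loaded only after the encoder has been freed.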

xyzzart commented 10 months ago

> I was able to run inference on Paperspace's RTX5000 machines (30 GB RAM, 16 GB GPU) using the 'Enable sequential CPU offload' option. It was... somewhat slow, but it worked. So I merged the K3 inference code into the main branch, even though I only managed to get txt2img working; I'm a little bit tired of all this stuff :| From now on I'm going to focus on some of my other projects, but I might return to 'Kubin' later.
>
> Good news πŸ‘, I will try it.

Please make it possible to mix images with the new Kandinsky model... Also, I got an error while using model 2.1 in mixing mode :(

seruva19 commented 10 months ago

@xyzzart As far as I know, Kandinsky 3 does not support image mixing out of the box.

Currently I am working on some other projects that are not related to Kandinsky, so do not expect any updates in the near future. Maybe when a new version of the model is released, and if it actually turns out to be usable on consumer-grade machines, I might resume development.

Concerning the 2.1 error: are you getting it while working in Colab or with a local install? If the former, I think I've just fixed it by updating the flash-attention wheels.

seruva19 commented 10 months ago

@klossm Thanks, I already read it yesterday :) Yes, there are some nice techniques described there, but constantly swapping submodels in and out of memory is not what I would call comfortable, as it heavily slows down inference 😐 By the way, there is also a very solid implementation of running K3 on low-end GPUs mentioned here: https://github.com/seruva19/kubin/discussions/166

Right now, I've stopped working on the 'kubin' project, but I'm not abandoning it. There's a chance I'll revisit it later if nobody comes up with a better implementation.