Yes, the repo has a separate branch for that: https://github.com/seruva19/kubin/tree/kandinsky-3, but you probably don't want it anyway, because the text encoder model is huge and requires A LOT of VRAM (approx. 40+ GB).
Update: they pruned the text encoder to FP16 and split it into several files of 5 GB each (https://huggingface.co/kandinsky-community/kandinsky-3/tree/main/text_encoder), so now maybe we have a chance.
I am researching methods to enable running K3 on mid-range consumer GPUs (16 GB and lower), specifically through 4-bit quantization of the encoder, but have not achieved success yet. There are a lot of people in the field who are smarter than me, though, so I'm confident this issue will be solved, by me or someone else :)
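For context, here is a minimal sketch of the 4-bit idea, assuming bitsandbytes through transformers. This is just an illustration of the approach, not working kubin code; the repo id and subfolder match the Hugging Face repo linked above, everything else is an assumption:

```python
# Hypothetical sketch: load only the K3 text encoder in 4-bit (NF4)
# via bitsandbytes. The quantization settings below are assumptions,
# not a tested kubin configuration.
import torch
from transformers import BitsAndBytesConfig, T5EncoderModel

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

text_encoder = T5EncoderModel.from_pretrained(
    "kandinsky-community/kandinsky-3",
    subfolder="text_encoder",
    quantization_config=quant_config,
    device_map="auto",  # spill whatever doesn't fit onto CPU RAM
)
```

In principle this encoder could then be passed into the pipeline (e.g. `Kandinsky3Pipeline.from_pretrained(..., text_encoder=text_encoder)`), though whether output quality survives 4-bit quantization is exactly the open question.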
`Error no file named diffusion_pytorch_model.bin found in directory models\models--kandinsky-community--kandinsky-3\snapshots\0e855b07c78f7284752814a95ccb55d27392b866\movq.`
Kandinsky 3 doesn't work.
I was able to run inference on Paperspace's RTX5000 machines (30 GB RAM, 16 GB GPU) using the 'Enable sequential CPU offload' option. It was... somewhat slow, but anyway)
![Screenshot 2023-12-10 172258](https://github.com/seruva19/kubin/assets/26826215/f7988a34-46a4-4909-9c8f-f1ca61ab4508) ![Screenshot 2023-12-10 172145](https://github.com/seruva19/kubin/assets/26826215/794ffed9-4741-4a5e-b444-1b31fa2b7e57)
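For reference, the 'Enable sequential CPU offload' option maps to the standard diffusers offloading call. A minimal standalone sketch, assuming a diffusers version with Kandinsky 3 support; the prompt and settings are arbitrary, and this is not kubin's exact code:

```python
# Minimal sketch: Kandinsky 3 text2img on a 16 GB GPU via sequential
# CPU offload (each submodule is moved to the GPU only for its forward
# pass, then back to CPU, trading speed for memory).
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "kandinsky-community/kandinsky-3",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.enable_sequential_cpu_offload()

image = pipe("a red cat sitting on a windowsill", num_inference_steps=25).images[0]
image.save("k3_test.png")
```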
So I merged the K3 inference code into the main branch, even though I only managed to get txt2img working. I'm a little bit tired of all this stuff :| From now on I'm going to focus on some of my other projects, but I might return to 'Kubin' later.
Good news :) I will try it.
Here's a link to an article about optimisation of PixArt-alpha; not sure if it's informative for Kandinsky 3. It's an optimisation script that reduces RAM and VRAM usage to a great extent!
https://www.felixsanz.dev/articles/pixart-a-with-less-than-8gb-vram
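The core trick in that article, as far as I can tell, is staged loading: compute the prompt embeddings first, free the text encoder, then load the denoiser without it. A rough sketch of that pattern using the diffusers PixArt pipeline (my paraphrase, not the author's exact script):

```python
# Stage 1: load only the text encoder, compute prompt embeddings,
# then free it before loading the heavy diffusion transformer.
import gc
import torch
from diffusers import PixArtAlphaPipeline

pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    transformer=None,  # skip the denoiser for now
    torch_dtype=torch.float16,
).to("cuda")

with torch.no_grad():
    embeds, mask, neg_embeds, neg_mask = pipe.encode_prompt("a forest at dawn")

del pipe
gc.collect()
torch.cuda.empty_cache()

# Stage 2: load the denoiser + VAE without the text encoder and reuse
# the precomputed embeddings.
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    text_encoder=None,
    tokenizer=None,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt_embeds=embeds,
    prompt_attention_mask=mask,
    negative_prompt_embeds=neg_embeds,
    negative_prompt_attention_mask=neg_mask,
).images[0]
image.save("pixart_staged.png")
```

The obvious cost is that every generation pays for reloading submodels from disk, which is the slowdown mentioned below.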
Please make it possible to mix images with the new Kandinsky model... also, I got an error while using model 2.1 in mixing mode(
@xyzzart As far as I know, Kandinsky 3 does not support image mixing out of the box.
Currently I am working on some other projects that are not related to Kandinsky, so do not expect any updates in the near future. Maybe when a new version of the model is released, and if it actually turns out to be usable on consumer-grade machines, I might resume development.
Concerning the 2.1 error: are you getting it while working in Colab or with a local install? If the former, I think I've just fixed it by updating the flash-attention wheels.
@klossm Thanks, I already read it yesterday :) Yes, there are some nice techniques described there, but constant switching of submodels is not what I would call comfortable, as it heavily slows down inference. By the way, there is also a very solid implementation of running K3 on low-end GPUs mentioned here: https://github.com/seruva19/kubin/discussions/166
Right now, I've stopped working on the 'kubin' project, but I'm not abandoning it. There's a chance I'll revisit it later if nobody comes up with a better implementation.
Any updates on the new Kandinsky 3 model? Thanks in advance.