tin2tin closed this issue 8 months ago
Wuerstchen looks very cool, but I haven't read much about it past the abstract! Hotshot-XL was trained to work with SDXL because of its balance between fidelity and prompt understanding.
There are some tips to getting Hotshot-XL running locally on low power hardware in the discord: https://discord.gg/85pqA3GG - join us!
Thank you. I've already joined, and what I tried to generate online looks really great. However, I don't see anything about how to bring down the VRAM requirements?
I mentioned Wuerstchen only because they found a way to compress the model, which makes it much smaller, faster, and more hardware friendly. Maybe applying these compression techniques to your project could make it more agile?
Have you joined the camenduru Discord server (not mine)? There is a very active TXT2VID community there.
We'll be adding VRAM requirements to the README soon! There's a new --low_vram_mode argument in the HotshotXLPipeline now which may help. All it does is move the text encoder and VAE to the CPU when they are not needed. Memory now peaks at 7.6 GB at fp16 when running the unet, but this still needs to be tested on smaller cards.
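The offload pattern described above can be sketched in a framework-free way. Note this is a minimal illustration, not Hotshot-XL's actual implementation: the `Module` class and `encode_prompt` helper here are hypothetical stand-ins for a real diffusers component and forward pass.

```python
class Module:
    """Stand-in for a pipeline component such as the text encoder or VAE
    (hypothetical, for illustration only)."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self


def encode_prompt(text_encoder, prompt, low_vram_mode=False):
    """Encode a prompt; in low-VRAM mode the text encoder only occupies
    the GPU for the duration of the encode, then returns to the CPU so
    the unet has the VRAM to itself."""
    if low_vram_mode:
        text_encoder.to("cuda")
    embeddings = f"embeds({prompt})"  # placeholder for the real forward pass
    if low_vram_mode:
        text_encoder.to("cpu")
    return embeddings


text_encoder = Module("text_encoder")
emb = encode_prompt(text_encoder, "a cat", low_vram_mode=True)
# after the call, the encoder is back on the CPU and only the
# embeddings remain for the denoising loop
```

For pipelines built directly on diffusers, `pipe.enable_model_cpu_offload()` (which requires accelerate) implements a similar move-to-GPU-only-when-needed strategy out of the box.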
Thank you. I'm developing this Blender add-on on 6 GB of VRAM, and it runs Modelscope, Zeroscope, and SDXL fine. Let me know if you at some point want me to test it on my low-spec hardware. https://github.com/tin2tin/Pallaidium
With a few edits to the JSON I was able to get Diffusers to download the model, but I realized it was far too big to run locally on my hardware. Have you tried the compression techniques used in the Wuerstchen project? https://huggingface.co/warp-ai/wuerstchen
Here's the JSON working with Diffusers: