Closed ZTMIDGO closed 1 year ago
I've been attempting to port the DDIM scheduler, but no luck so far. This is where I got stuck on my own implementation of SD for C#. What are you using for UINT8 quantization? My current goal is to get it running as fast as it can on the CPU.
@jdluzen I use this code for uint8 quantization: https://github.com/LowinLi/stable-diffusion-streamlit/blob/main/src/stable-diffusion-streamlit/pages/model/quantization.py. Currently I'm trying to run SD on a mobile device with onnxruntime-android, but it's very slow: on a Snapdragon 855 CPU a single step takes 5 seconds. I don't know how to make it run faster on an ARM CPU at the moment, and NNAPI doesn't help.
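For anyone following along: the linked script uses ONNX Runtime's dynamic quantization under the hood. As a rough illustration of what asymmetric uint8 quantization does to a tensor (this is a minimal NumPy sketch with hypothetical helper names, not the actual onnxruntime implementation):

```python
import numpy as np

def quantize_uint8(x):
    """Asymmetric uint8 quantization: map [min(x), max(x)] onto [0, 255]
    via a scale and a zero point."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0  # avoid div-by-zero for constant tensors
    zero_point = round(-lo / scale)
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize_uint8(q, scale, zero_point):
    """Recover an approximation of the original float tensor."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.linspace(-1.0, 1.0, 11).astype(np.float32)
q, s, zp = quantize_uint8(x)
x_hat = dequantize_uint8(q, s, zp)
# round-trip error is bounded by roughly half a quantization step
assert np.abs(x - x_hat).max() <= s / 2 + 1e-6
```

The half-step error bound is what ultimately limits quality: schedulers that are sensitive to small per-step errors (like LMS, as discussed below) accumulate this quantization noise across steps.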
Thanks, that's what I was doing as well. I'll keep chipping away at the porting and see what happens. Do you know if LMS doesn't like INT8 in general? Or maybe we can make it INT8-aware?
@jdluzen LMS is not suitable for uint8; you need to use EulerA or DPM-Solver. You can refer to the Python implementations to port EulerA and DPM-Solver.
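For reference while porting: the distinctive part of EulerA is splitting each step into a deterministic Euler part and freshly injected ancestral noise. A minimal sketch of that split, following the usual k-diffusion-style formulation (function and variable names here are my own, not from any particular codebase):

```python
import math

def ancestral_step_sigmas(sigma_from, sigma_to):
    """Split a step from sigma_from down to sigma_to into a deterministic
    target (sigma_down) and injected noise (sigma_up), chosen so that
    sigma_down^2 + sigma_up^2 == sigma_to^2."""
    sigma_up = math.sqrt(
        sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up

# One Euler-ancestral update for sample x, given the model's denoised
# prediction, would then look like:
#   d = (x - denoised) / sigma_from          # derivative estimate
#   x = x + d * (sigma_down - sigma_from)    # deterministic Euler step
#   x = x + randn_like(x) * sigma_up         # ancestral noise injection
```

Because fresh noise is re-injected each step, EulerA tends to wash out small per-step errors, which may be why it tolerates uint8 quantization better than LMS's multi-step history.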
Well, EulerA is ported. It compiles and runs, but the output is incorrect 😅 I need to push it to a fork and then open a PR to get some more eyes on it. I've got a ton of local changes to sift through first.
Draft PR is available: https://github.com/cassiebreviu/StableDiffusion/pull/12
PR 12 is now working. The EulerA scheduler will be in the main branch soon!
The current LMS scheduler produces poor-quality output with the quantized UINT8 model.
Hopefully the DDIM and DPM-Solver schedulers can be added; they should improve generation quality for quantized models.
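For anyone who picks up the DDIM port: the core deterministic (eta = 0) DDIM update is compact. A hedged sketch of that single step (my own function name; `alpha_t`/`alpha_prev` are the cumulative alpha-bar products at the current and previous timestep, and `eps` is the model's noise prediction):

```python
import math

def ddim_step(x_t, eps, alpha_t, alpha_prev):
    """One deterministic DDIM (eta=0) update for a scalar or array-like x_t.
    First reconstruct the predicted clean sample, then re-noise it to the
    previous timestep's noise level."""
    pred_x0 = (x_t - math.sqrt(1.0 - alpha_t) * eps) / math.sqrt(alpha_t)
    return math.sqrt(alpha_prev) * pred_x0 + math.sqrt(1.0 - alpha_prev) * eps
```

A useful sanity check when porting: if `alpha_prev == alpha_t`, the step is an exact no-op, since the reconstruction and re-noising cancel.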