vitoplantamura / OnnxStream

Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but also Mistral 7B on desktops and servers. ARM, x86, WASM, RISC-V supported. Accelerated by XNNPACK.
https://yolo.vitoplantamura.com/

Safetensors? #67

Open KintCark opened 6 months ago

KintCark commented 6 months ago

Can we use safetensors models on this, or only Diffusers?

vitoplantamura commented 6 months ago

hi,

HF Diffusers supports loading safetensors files, so I don't think I quite understand your question :-)

More info here: https://github.com/vitoplantamura/OnnxStream#how-to-convert-and-run-a-custom-stable-diffusion-15-model-with-onnxstream-by-gaelicthunder
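For reference, loading a single-file checkpoint with Diffusers looks roughly like this (a minimal sketch, not from this thread; the path is a placeholder and the API may differ between Diffusers versions):

```python
# Minimal sketch (illustrative): loading a single-file .safetensors
# checkpoint with HF Diffusers. "model.safetensors" is a placeholder.
from diffusers import StableDiffusionPipeline

# from_single_file() reads an all-in-one checkpoint (e.g. downloaded
# from Civitai) and splits it into the usual Diffusers components.
pipe = StableDiffusionPipeline.from_single_file("model.safetensors")

# save_pretrained() writes the familiar multi-folder layout
# (unet/, vae/, text_encoder/, tokenizer/, ...).
pipe.save_pretrained("model_diffusers")
```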

Vito

KintCark commented 6 months ago

I mean, if I wanted to use safetensors models from Civitai, would that work? I installed OnnxStream on Android, but when I looked in the sdxl-turbo folder it isn't a safetensors file; it's a Diffusers layout, with folders like text_encoder, vae_decoder, tokenizer, etc. If I put in an SDXL safetensors file, which is a single file, will it work?

vitoplantamura commented 6 months ago

No, the safetensors file first needs to be converted to the format compatible with OnnxStream.

This is not a simple procedure. The link I posted before explains how to do it for SD1.5. The procedure for SDXL is similar, but it is still not trivial.
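Very roughly, the chain looks like the sketch below (a hedged outline for an SD1.5-style checkpoint; paths and shapes are illustrative assumptions, and the authoritative steps are in the README link above):

```python
# Hedged sketch of the conversion chain: .safetensors -> Diffusers ->
# ONNX -> OnnxStream. Follow the README link above for the real steps.
import torch
from diffusers import StableDiffusionPipeline

# 1. Split the single .safetensors checkpoint into Diffusers components.
pipe = StableDiffusionPipeline.from_single_file("custom.safetensors")

# 2. Export one component (here the UNet) to ONNX.
unet = pipe.unet.eval()
sample = torch.randn(1, 4, 64, 64)   # 512 / 8 = 64 latent resolution (SD1.5)
timestep = torch.tensor([1.0])
states = torch.randn(1, 77, 768)     # CLIP text embeddings for SD1.5
torch.onnx.export(
    unet,
    (sample, timestep, states, {"return_dict": False}),  # return a tuple
    "unet.onnx",
    input_names=["sample", "timestep", "encoder_hidden_states"],
    opset_version=17,
)

# 3. Simplify unet.onnx (e.g. with onnx-simplifier) and run OnnxStream's
#    onnx2txt conversion to produce the files the runtime loads.
```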

Vito

noah003 commented 5 months ago

> No, the safetensors file first needs to be converted to the format compatible with OnnxStream. This is not a simple procedure. The link I posted before explains how to do it for SD1.5. The procedure for SDXL is similar, but it is still not trivial. Vito

It's too difficult to convert the .safetensors model...

AeroX2 commented 3 months ago

I've taken a quick stab at this and put together https://github.com/AeroX2/safetensor2onnx2txt

It currently only supports SDXL Turbo models, but it could probably be modified easily to support other models. As a bonus, it should also support LoRAs, although your mileage may vary, since I've only tested it with one SDXL Turbo LoRA.

It still needs a bunch more debugging, as some models don't work at all and I have no idea why, and I still need to support OnnxStream's tiling mode. (Actually, it looks like tiling works with just the UNet conversion; there's no need to convert the VAE.)

vitoplantamura commented 3 months ago

really interesting!

I took a look at the code: the idea of a single project that manages the entire conversion process is excellent! I plan to try it ASAP!

Thanks, Vito

vitoplantamura commented 2 months ago

hi AeroX2,

I finally found the time to try your project properly, and it works flawlessly.

I tried it with a LoRA model for SDXL, modifying the code a bit (for example, the size of the latents input to the UNet model).

It took 70 minutes in an Ubuntu VM on my 2018 laptop, with 16GB RAM + 16GB swap (16GB of RAM alone wasn't enough: the model-simplification step was killed). It would probably take much less time with 32GB of RAM available directly.
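For anyone following along, the latents change is along these lines (an illustrative assumption, not AeroX2's actual code): SDXL's default 1024x1024 resolution means 128x128 latents, since the VAE downsamples by a factor of 8, versus 64x64 for 512x512 SD1.5.

```python
import torch

# Dummy latents fed to the UNet when tracing/exporting it to ONNX.
# SDXL defaults to 1024x1024 images; the VAE downsamples by 8.
batch, latent_channels = 1, 4
height = width = 1024
sample = torch.randn(batch, latent_channels, height // 8, width // 8)
print(sample.shape)  # torch.Size([1, 4, 128, 128]) vs. [1, 4, 64, 64] for SD1.5
```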

I hope to find some time in the future to fork the project and implement all the cases, which, as you also wrote, should be simple.

Thank you, Vito