Ryu1845 opened 1 year ago
Actually, this is probably irrelevant if you're switching to ONNX for TensorRT.
Actually, I want to reopen this, but with coreweave/tensorizer.
Why?
In general: TensorRT is probably going to take a while :upside_down_face:. Faster model loading would be very helpful for my print()-debugging development process.
tensorizer vs. safetensors: I was told the former is faster, but granted, I haven't actually tested that claim.
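A minimal sketch one could use to check that claim, assuming the documented APIs of both libraries (`save_file`/`load_file` from safetensors, `TensorSerializer`/`TensorDeserializer` from tensorizer); the tiny Linear module and the file names are placeholders, not the real tortoise-tts weights:

```python
# Rough timing sketch: write the same weights in both formats,
# then time each loader. Swap in real weights for a meaningful test.
import time

import torch
from safetensors.torch import load_file, save_file
from tensorizer import TensorDeserializer, TensorSerializer

model = torch.nn.Linear(1024, 1024)  # placeholder stand-in model

# Serialize once in each format.
save_file(model.state_dict(), "model.safetensors")
serializer = TensorSerializer("model.tensors")
serializer.write_module(model)
serializer.close()

# Time the safetensors load.
t0 = time.perf_counter()
model.load_state_dict(load_file("model.safetensors"))
print(f"safetensors: {time.perf_counter() - t0:.3f}s")

# Time the tensorizer load.
t0 = time.perf_counter()
deserializer = TensorDeserializer("model.tensors")
deserializer.load_into_module(model)
deserializer.close()
print(f"tensorizer:  {time.perf_counter() - t0:.3f}s")
```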
I can try to convert the weights in a couple hours if you don't want to do it yourself
that would be great
OK, I gave it a go, and I think it only works for Hugging Face transformers/diffusers models. I can serialize the model, but it doesn't output a config JSON, so it can't be loaded afterward.
Ah well, I'll figure out how to deal with that later...
I was wondering how you'd handle some of the hot-loaded modules in the NNs (like the wpe GPT layers that are just stitched in on the fly right now).
Converting it to safetensors is actually really easy (it's pretty much two lines of code), so maybe I can do that in the meantime.
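For reference, a minimal sketch of that conversion, assuming the checkpoint is a plain state dict; the filenames here are placeholders for whichever tortoise-tts weight file:

```python
import torch
from safetensors.torch import save_file

# Load the pickle-based PyTorch checkpoint, then re-save it as safetensors.
state_dict = torch.load("autoregressive.pth", map_location="cpu")
save_file(state_dict, "autoregressive.safetensors")
```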
Does converting make any difference in loading?
It makes it faster 😁
Btw, here are the converted models: https://huggingface.co/Gatozu35/tortoise-tts/tree/main
It's missing the vocoder because its weights aren't a state dict, I think. I might add it later if needed.
As shown here, switching to safetensors would speed up model loading, which is a small but non-negligible gain.
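For completeness, a sketch of the loading side under the same assumptions (placeholder filename; the network to restore into is constructed elsewhere):

```python
from safetensors.torch import load_file

# load_file avoids pickle and reads tensors directly from the file,
# which is where the load-time win over torch.load comes from.
state_dict = load_file("autoregressive.safetensors")
print(f"loaded {len(state_dict)} tensors")

# Then restore into the already-constructed network as usual, e.g.:
# model.load_state_dict(state_dict)
```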