Ryu1845 opened 1 year ago
Actually, this is probably irrelevant if you're switching to ONNX for TensorRT.
Actually, I want to reopen this, but with coreweave/tensorizer.
Why?
In general: TensorRT is probably going to take a while :upside_down_face:. Faster model loading would be very helpful for my print()-debugging development process.
tensorizer vs. safetensors: I was told the former is faster, but granted, I haven't actually tested that claim.
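A minimal sketch one could use to check that claim, assuming the documented APIs of both libraries (`save_file`/`load_file` from safetensors, `TensorSerializer`/`TensorDeserializer` from tensorizer); the tiny Linear module and the file names are placeholders, not the real tortoise-tts weights:

```python
# Rough timing sketch: write the same weights in both formats,
# then time each loader. Swap in real weights for a meaningful test.
import time

import torch
from safetensors.torch import load_file, save_file
from tensorizer import TensorDeserializer, TensorSerializer

model = torch.nn.Linear(1024, 1024)  # placeholder stand-in model

# Serialize once in each format.
save_file(model.state_dict(), "model.safetensors")
serializer = TensorSerializer("model.tensors")
serializer.write_module(model)
serializer.close()

# Time the safetensors load.
t0 = time.perf_counter()
model.load_state_dict(load_file("model.safetensors"))
print(f"safetensors: {time.perf_counter() - t0:.3f}s")

# Time the tensorizer load.
t0 = time.perf_counter()
deserializer = TensorDeserializer("model.tensors")
deserializer.load_into_module(model)
deserializer.close()
print(f"tensorizer:  {time.perf_counter() - t0:.3f}s")
```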
I can try to convert the weights in a couple hours if you don't want to do it yourself
that would be great
OK, I gave it a go, and I think it only works for Hugging Face transformers/diffusers models. I can serialize the model, but it doesn't output a config JSON, so it can't be loaded afterward.
Ah well, I'll figure out how to deal with that later...
I was wondering how you'd handle some of the hot-loaded modules in the NNs (like the wpe GPT layers that are just stitched in on the fly right now).
Converting it to safetensors is actually really easy (it's pretty much two lines of code), so maybe I can do that in the meantime.
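For reference, a minimal sketch of that conversion, assuming the checkpoint is a plain state dict; the filenames here are placeholders for whichever tortoise-tts weight file:

```python
import torch
from safetensors.torch import save_file

# Load the pickle-based PyTorch checkpoint, then re-save it as safetensors.
state_dict = torch.load("autoregressive.pth", map_location="cpu")
save_file(state_dict, "autoregressive.safetensors")
```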
Does converting make any difference in loading?
It makes it faster 😁
Btw, here are the converted models: https://huggingface.co/Gatozu35/tortoise-tts/tree/main
It's missing the vocoder because its weights aren't a state dict, I think. I might add it later if needed.
As shown here, switching to safetensors would speed up model loading, which is a small but non-negligible gain.
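For completeness, a sketch of the loading side under the same assumptions (placeholder filename; the network to restore into is constructed elsewhere):

```python
from safetensors.torch import load_file

# load_file avoids pickle and reads tensors directly from the file,
# which is where the load-time win over torch.load comes from.
state_dict = load_file("autoregressive.safetensors")
print(f"loaded {len(state_dict)} tensors")

# Then restore into the already-constructed network as usual, e.g.:
# model.load_state_dict(state_dict)
```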