half precision on diffuser - results

152334H / tortoise-tts-fast

Fast TorToiSe inference (5x or your money back!)

GNU Affero General Public License v3.0

half precision on diffuser - results #6

Open hesz94 opened 1 year ago

hesz94 commented 1 year ago

The diffuser can supposedly be switched to half precision when it's being built (line 239 in api.py). I tried this on a 3090 Ti and got only a marginal speed improvement (about 5%, IIRC). Presumably this switch isn't properly implemented and the model falls back to full precision. I'll have to do a proper trace of the diffuser run to see exactly what's going on.
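One way to do that kind of trace is to hook every leaf module and record the dtype of what it outputs during a single forward pass. This is a minimal sketch, not code from the repo: `record_output_dtypes` is a hypothetical helper and `toy` is a stand-in for the real diffusion model, which would need its actual inputs.

```python
import torch
import torch.nn as nn


def record_output_dtypes(model, *inputs):
    """Run one forward pass and return {leaf module name: output dtype}."""
    seen = {}
    hooks = []
    for name, module in model.named_modules():
        # Only hook leaf modules; containers just forward to their children.
        if len(list(module.children())) > 0:
            continue

        def hook(mod, args, out, name=name):
            if torch.is_tensor(out):
                seen[name] = out.dtype

        hooks.append(module.register_forward_hook(hook))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for h in hooks:
            h.remove()
    return seen


# Toy stand-in (assumption): the real target would be the diffusion model.
toy = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))
dtypes = record_output_dtypes(toy, torch.randn(2, 8))
for layer_name, dtype in dtypes.items():
    print(layer_name, dtype)
```

If most layers still report torch.float32 with the half-precision switch enabled, that would be consistent with the small speedup.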

152334H commented 1 year ago

I didn't even notice that flag, wow.

In diffusion_decoder.py, use_fp16 is only used to set DiffusionTts().enable_fp16, which in turn is only used here:

https://github.com/152334H/tortoise-tts-fast/blob/main/tortoise/models/diffusion_decoder.py#L303-306

I think the last time I tried to wrap an autocast around the whole model, I got an error somewhere that I couldn't fix. I'll pass the fp16 flag through for the time being.
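For reference, wrapping a whole forward pass in autocast looks roughly like the sketch below. It uses a toy model (an assumption, not the actual DiffusionTts) and falls back to bfloat16 on CPU so it runs without a GPU. The weights stay in float32; only the ops that autocast considers safe run in the lower precision.

```python
import torch
import torch.nn as nn

# Pick fp16 on a GPU, bf16 on CPU (CPU autocast is aimed at bfloat16).
device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

# Toy stand-in (assumption) for the model being wrapped.
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4)).to(device)
x = torch.randn(2, 8, device=device)

# Everything inside the context is eligible for mixed precision.
with torch.no_grad(), torch.autocast(device_type=device, dtype=amp_dtype):
    y = toy(x)

weight_dtype = toy[0].weight.dtype
print("output dtype:", y.dtype)
print("weight dtype:", weight_dtype)
```

Errors with this approach typically come from custom ops or explicit dtype assumptions inside the model, which a toy example like this will not reproduce.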

hesz94 commented 1 year ago

Yeah, I tried manually converting the entire model and failed too. It should be doable, it's just not going to be as easy as we'd wish.
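The manual route amounts to casting every floating-point parameter and buffer with `.half()` and then making sure the inputs are cast to match. A minimal sketch on a toy model (an assumption, not the real diffusion decoder) that only checks the conversion and does not run a forward pass:

```python
import torch
import torch.nn as nn

# Toy stand-in (assumption) with a norm layer, like most diffusion blocks.
toy = nn.Sequential(nn.Linear(8, 8), nn.GroupNorm(2, 8), nn.Linear(8, 4))

# .half() casts all floating-point parameters and buffers in place.
toy.half()

param_dtypes = {name: p.dtype for name, p in toy.named_parameters()}

# Anything listed here was not converted and would need a closer look.
leftover = [name for name, dtype in param_dtypes.items() if dtype != torch.float16]
print("parameters not in fp16:", leftover)

# Inputs have to be cast as well, otherwise the first layer sees mixed dtypes.
x = torch.randn(2, 8).half()
print("input dtype:", x.dtype)
```

In a real model the hard part is everything this check does not cover: tensors created inside forward() with a hard-coded dtype, and layers that are numerically unstable in fp16.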