neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Apache License 2.0

12.99k stars, 1.79k forks

Int 4 quantization? #526

Open fakerybakery opened 1 year ago

fakerybakery commented 1 year ago

Hello, I'm relatively new here, and I was wondering whether 4-bit quantization could be used to speed up inference. I know that tortoise-tts-fast supports half precision, but is it possible to quantize the models further? Thank you!

neonbjb commented 1 year ago

I'm also interested in whether this works!

fakerybakery commented 1 year ago

I wonder if @ggerganov could make tortoise.cpp?

thiswillbeyourgithub commented 10 months ago

balisujohn commented 9 months ago

https://github.com/balisujohn/tortoise.cpp

A tortoise.cpp implementation is underway :) Contributions are welcome!