Loading Parlor at 16-bit

erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and wav file maintenance. It can also be used with third-party software via JSON calls.

GNU Affero General Public License v3.0

864 stars, 98 forks

Is your feature request related to a problem? Please describe.

Parlor-large is, well, rather large, so it would be nice to have the option to load it in float16 or bfloat16 instead of float32.

Describe the solution you'd like

Add an option in the Parlor settings to choose the dtype: float32, float16, or bfloat16.
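To make the request concrete, here is a minimal sketch of how such a setting could be validated before it reaches the model loader. The names `ALLOWED_DTYPES`, `resolve_dtype_setting`, and `to_torch_dtype` are hypothetical and are not taken from the AllTalk codebase.

```python
# Hypothetical helper names; not part of AllTalk.
ALLOWED_DTYPES = ("float32", "float16", "bfloat16")


def resolve_dtype_setting(value, default="float32"):
    """Normalize a user-supplied dtype setting and reject unsupported values."""
    name = (value or default).strip().lower()
    if name not in ALLOWED_DTYPES:
        raise ValueError(
            f"Unsupported dtype {value!r}; expected one of {ALLOWED_DTYPES}"
        )
    return name


def to_torch_dtype(value):
    """Map the validated setting to the matching torch dtype object."""
    import torch  # imported lazily so validation works without PyTorch installed

    return getattr(torch, resolve_dtype_setting(value))
```

In a Hugging Face style loader, the result of `to_torch_dtype` would typically be passed as `torch_dtype=` to `from_pretrained`. Whether AllTalk's Parlor loader is structured that way is an assumption here.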

Describe alternatives you've considered

Actual quantization for even better memory savings. I don't know whether any quantization methods work with this model right now, though.

Additional context

The effect on quality is difficult to determine because of the model's inherent variability, but I did not notice anything significant. I think the option to save 50% on memory is more significant than any minor quality loss. This is easy to implement: only the dtype used when loading the model needs to be changed.
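The 50% figure follows from the storage size of each parameter: float32 uses 4 bytes, while float16 and bfloat16 use 2. A rough sketch of the arithmetic is below. The parameter count is a placeholder and not the measured size of Parlor-large, and the estimate covers weights only, not activations or other runtime buffers.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gib(num_params, dtype):
    """Estimate the memory taken by model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    num_params = 2_000_000_000  # placeholder count, for illustration only
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gib(num_params, dtype):.2f} GiB")
```

Whatever the real parameter count is, the 16-bit estimate comes out at exactly half of the float32 estimate.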


Loading Parlor at 16-bit #303