I've been trying to quantize the Linear and convolutional layers of my model to speed up inference, but I'm getting mixed results. This is what I'm doing at the moment:
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear, torch.nn.Conv1d},  # add other layers as needed
    dtype=torch.qint8,
    inplace=True
)
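In case a runnable snippet helps, here's a stripped-down version of what I'm doing with a toy model (layer sizes are made up) plus a rough latency check:

```python
import time

import torch
import torch.nn as nn

# Toy model just to illustrate the setup; sizes are arbitrary
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(8, 16, kernel_size=3)
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        x = self.conv(x)   # (batch, 16, L-2)
        x = x.mean(dim=2)  # global average pool -> (batch, 16)
        return self.fc(x)

model = ToyModel().eval()

quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear, torch.nn.Conv1d},
    dtype=torch.qint8,
)

# Rough latency check on a dummy input
x = torch.randn(1, 8, 64)
with torch.no_grad():
    start = time.perf_counter()
    for _ in range(100):
        quantized_model(x)
    avg_ms = (time.perf_counter() - start) / 100 * 1e3
    print(f"avg latency: {avg_ms:.3f} ms")
```

One thing I noticed with this repro: the Linear layer gets swapped for a dynamic quantized module, but the Conv1d appears to be left as a regular float module, which might explain the mixed results.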
Does anyone have any advice?