nshmyrev opened 4 years ago
As I understand it, there are no good 16-bit matrix libraries for ARM, so we actually have to quantize to 8 bits and use QNNPACK. Some day, maybe, with the PyTorch move.
The LM and graphs could be quantized to 16 bits, or even 10 bits.
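For context, 8-bit quantization (the scheme backends like QNNPACK operate on) typically maps float values to integers via a per-tensor scale and zero point. A minimal sketch in plain Python, purely illustrative and not Vosk's actual implementation:

```python
# Sketch of affine (asymmetric) 8-bit quantization: floats in [lo, hi]
# are mapped to integers in [0, 255]. Hypothetical helper names.

def quantize_params(values, num_bits=8):
    """Compute scale and zero point mapping [lo, hi] onto [0, 2^bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must include 0.0 so zero maps exactly.
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(qvalues, scale, zero_point):
    return [(q - zero_point) * scale for q in qvalues]

weights = [-0.5, 0.0, 0.25, 1.0]
scale, zp = quantize_params(weights)
q = quantize(weights, scale, zp)
restored = dequantize(q, scale, zp)
```

The round trip loses at most `scale / 2` per value, which is why wider ranges (bigger models, outlier weights) quantize less accurately.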
Hey @nshmyrev, just curious: is there any news about this feature? 🙂
In the development branch https://github.com/alphacep/vosk-api/tree/vosk-new we support PyTorch models with 8-bit quantization.
Wow, this is amazing news! I'll try to run an app from this branch on macOS and Linux.
@nshmyrev sorry, another quick question. Do you know when the version (and models) with quantization will be released?
no
So we can use bigger models on mobile more efficiently.