kyutai-labs / moshi

Apache License 2.0

Mimi Model Quantizer Configuration #144

Closed by WWWWxp 3 weeks ago

WWWWxp commented 3 weeks ago

Topic

The paper

Question

Hello, I have a question. The paper describes the Mimi model as using Q=8 quantizers, but the open-source Mimi model on Hugging Face defaults to num_quantizers=32. Does the configuration of the open-source Mimi differ from the one in the paper?

hdmjdp commented 1 week ago

same question

LaurentMazare commented 1 week ago

We have different versions of mimi, typically with 8, 16, and 32 quantizers. The open-source release has the full 32 quantizers, though when using moshi we only generate the first 8 levels, so the audio quality is slightly degraded.
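
To illustrate why keeping only the first 8 of 32 levels costs some quality, here is a toy sketch of residual quantization in plain Python. It is not Mimi's actual quantizer: the scalar input, the two-entry codebook per level, and the names `rvq_encode` and `rvq_decode` are all assumptions made for illustration. Each level quantizes whatever error the previous levels left behind, so decoding from a prefix of the codes gives a coarser reconstruction.

```python
def rvq_encode(x, num_levels):
    """Toy residual quantizer (hypothetical, not Mimi's implementation).

    Level k has a two-entry codebook {-2**-k, +2**-k} and encodes the
    residual left over by levels 0..k-1.
    """
    codes = []
    residual = x
    for k in range(num_levels):
        step = 0.5 ** k
        code = 1 if residual >= 0 else 0  # index into the level-k codebook
        codes.append(code)
        residual -= step if code else -step
    return codes


def rvq_decode(codes, keep=None):
    """Reconstruct from the first `keep` levels (all levels if None)."""
    keep = len(codes) if keep is None else keep
    return sum((0.5 ** k) * (1 if c else -1) for k, c in enumerate(codes[:keep]))


x = 0.3
codes = rvq_encode(x, num_levels=32)      # "full" 32-level encoding
err_8 = abs(x - rvq_decode(codes, keep=8))  # decode from the first 8 levels only
err_32 = abs(x - rvq_decode(codes))         # decode from all 32 levels
print(f"error with 8 levels:  {err_8:.3e}")
print(f"error with 32 levels: {err_32:.3e}")
```

In this toy setup the first 8 codes are still a valid, decodable prefix of the 32-level encoding. They just leave a larger residual error, which matches the trade-off described above.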