Closed (WWWWxp closed this 3 weeks ago)
same question
We have different versions of Mimi, typically with 8, 16, or 32 quantizers. The open-source release has the full 32 quantizers, but Moshi only generates the first 8 levels, so its audio quality is slightly degraded compared with decoding all 32.
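To illustrate why keeping only the first 8 of 32 levels costs some quality, here is a toy sketch of residual vector quantization in plain Python. It is not Mimi's actual implementation: the codebooks are random, the sizes are arbitrary, and every name (`make_codebooks`, `rvq_encode`, `rvq_decode`) is hypothetical. It only assumes that each level quantizes the residual left by the previous levels, so decoding from fewer levels leaves a larger reconstruction error.

```python
import random


def make_codebooks(num_levels, codebook_size, dim, seed=0):
    """Build toy codebooks (assumption: random entries, finer scale at each level)."""
    rng = random.Random(seed)
    codebooks = []
    for level in range(num_levels):
        scale = 0.5 ** level
        # Including the zero vector guarantees a level never increases the residual.
        entries = [[0.0] * dim]
        for _ in range(codebook_size - 1):
            entries.append([rng.gauss(0.0, scale) for _ in range(dim)])
        codebooks.append(entries)
    return codebooks


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rvq_encode(vector, codebooks):
    """Greedy residual quantization: pick one code index per level."""
    residual = list(vector)
    codes = []
    for entries in codebooks:
        idx = min(range(len(entries)), key=lambda i: sq_dist(residual, entries[i]))
        codes.append(idx)
        residual = [r - e for r, e in zip(residual, entries[idx])]
    return codes


def rvq_decode(codes, codebooks, num_levels):
    """Reconstruct by summing the chosen entries of the first num_levels codebooks."""
    out = [0.0] * len(codebooks[0][0])
    for level in range(num_levels):
        entry = codebooks[level][codes[level]]
        out = [o + e for o, e in zip(out, entry)]
    return out


# Encode once with all 32 levels, then decode with 8 levels and with 32 levels.
codebooks = make_codebooks(num_levels=32, codebook_size=16, dim=4)
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(4)]
codes = rvq_encode(x, codebooks)
err_8 = sq_dist(x, rvq_decode(codes, codebooks, 8))
err_32 = sq_dist(x, rvq_decode(codes, codebooks, 32))
print(f"squared error with 8 levels: {err_8:.6g}, with 32 levels: {err_32:.6g}")
```

In this sketch the 8-level reconstruction is never better than the 32-level one, which mirrors the trade-off described above: the same codec can be decoded from a prefix of its levels, at lower fidelity.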
Due diligence
Topic
The paper
Question
Hello, I would like to ask a question. In the paper, we see that the Mimi model has Q=8 quantizers, but in the open-source Mimi model on Hugging Face, the default setting is num_quantizers=32. Are the configurations of the open-source Mimi and the one in the paper different?