Open rkstgr opened 1 year ago
1.5 kbps is indeed two codebooks. Sorry if a typo somewhere made you think it was 4. Can you point me to what made you think we used a multiple of 4 codebooks?
The current paper version on arXiv (https://arxiv.org/abs/2210.13438), in Section 3.2 Residual Vector Quantization, says:
When doing variable bandwidth training, we select randomly a number of codebooks as a multiple of 4, i.e. corresponding to a bandwidth 1.5, 3, 6, 12 or 24 kbps at 24 kHz.
Okay, that's a mistake! Thanks for pointing this out, we will fix it in the next revision of the paper.
Good catch @rkstgr
I was confused for a second, thanks.
❓ Questions
I don't understand how you arrive at the smallest bitrate of 1.5 kbps for the 24 kHz model:
If I understand correctly, we take a number of codebooks that is a multiple of 4 (4, 8, 12, ..., so 4 would be the minimum), with 10 bits per codebook (2^10 = 1024 entries) and, for the 24 kHz model, 75 latent codes per second. That gives the smallest possible bitrate: 4 × 10 bits × 75 1/s = 3 kbps
However, both the paper and the README state that the lowest bitrate is 1.5 kbps. Looking at the bitrate progression (1.5, 3, 6, 12, 24), which doubles at each step, wouldn't that rather correspond to 2, 4, 8, 16, 32 codebooks being used? Maybe I am just misinterpreting or missing something. Could you please clarify this point?
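For anyone who wants to check the arithmetic, here is a minimal Python sketch. The function name `bitrate_kbps` and its parameter names are made up for illustration and are not taken from the repository; the only inputs are the numbers quoted above (1024 entries per codebook, 75 latent codes per second for the 24 kHz model).

```python
import math


def bitrate_kbps(n_codebooks, codebook_size=1024, frame_rate_hz=75):
    """Bitrate in kbps for a given number of residual codebooks.

    Hypothetical helper: each codebook contributes log2(codebook_size)
    bits per latent frame, and there are frame_rate_hz frames per second.
    """
    bits_per_code = math.log2(codebook_size)  # 10 bits for 1024 entries
    return n_codebooks * bits_per_code * frame_rate_hz / 1000


# Codebook counts that double at each step, as the bitrate list suggests.
for n in (2, 4, 8, 16, 32):
    print(f"{n:2d} codebooks -> {bitrate_kbps(n)} kbps")
```

With these assumptions, 2 codebooks give 1.5 kbps and 4 codebooks give 3 kbps, so the published list (1.5, 3, 6, 12, 24) lines up with 2, 4, 8, 16, 32 codebooks rather than with multiples of 4.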