Closed shinshoji01 closed 2 weeks ago
The streaming codes should be roughly in line with the non-streaming ones. The difference you're seeing is likely due to numerical instability around the 1-D convolution implementations. I just gave this a try on a sample file: I got no differences on the semantic tokens, but some differences at the higher codebook levels. I don't know of an easy way to get exactly the same values.
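To make the "numerical instability around the 1-D convolutions" point concrete, here is a minimal NumPy sketch (not Mimi's actual code) of a causal 1-D convolution run once over the whole signal versus chunk-by-chunk with carried left context, the way a streaming encoder processes audio. In this toy version the two paths match to within float precision; in a real model, vectorized kernels and different computation orders between the streaming and non-streaming paths can introduce the small discrepancies described above.

```python
import numpy as np

def causal_conv_full(x, w):
    """Causal 1-D convolution over the whole signal at once."""
    k = len(w)
    xp = np.concatenate([np.zeros(k - 1, dtype=x.dtype), x])  # left-pad with zeros
    return np.array([xp[i:i + k] @ w for i in range(len(x))])

def causal_conv_streaming(x, w, chunk=160):
    """Same convolution applied chunk by chunk, carrying the left context."""
    k = len(w)
    state = np.zeros(k - 1, dtype=x.dtype)  # streaming state: last k-1 input samples
    out = []
    for start in range(0, len(x), chunk):
        c = x[start:start + chunk]
        xp = np.concatenate([state, c])
        out.append(np.array([xp[i:i + k] @ w for i in range(len(c))]))
        state = xp[len(c):]  # keep the trailing k-1 samples for the next chunk
    return np.concatenate(out)

rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)
w = rng.standard_normal(7).astype(np.float32)

full = causal_conv_full(x, w)
stream = causal_conv_streaming(x, w)
print(np.max(np.abs(full - stream)))  # tiny; bitwise equality is not guaranteed in general
```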
OK, thank you very much. I will use the non-streaming version.
Topic
The PyTorch implementation
Question
Hi, I'm using Mimi's semantic tokens in my research and want to extract them in advance. Since extracting semantic tokens in streaming mode is time-consuming, I want to do it in parallel. However, when I extracted tokens in streaming and non-streaming mode, I got slightly different values from the two. Is there any way to obtain identical values? Can I obtain the same values as the streaming mode while running in parallel?
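One way to see why "roughly in line" activations can still produce occasionally different tokens: the quantizer snaps each continuous embedding to its nearest codebook entry, so a tiny numerical perturbation only flips a token when the embedding sits close to a decision boundary. The sketch below uses a random, purely hypothetical codebook (not Mimi's) and Gaussian noise as a stand-in for convolution round-off, and measures the resulting token mismatch rate.

```python
import numpy as np

rng = np.random.default_rng(1)
codebook = rng.standard_normal((2048, 16)).astype(np.float32)  # hypothetical VQ codebook

def quantize(emb):
    # Nearest-neighbour lookup: token_i = argmin_j ||emb_i - codebook_j||^2
    d = ((emb[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(-1)

emb = rng.standard_normal((500, 16)).astype(np.float32)
noise = 1e-3 * rng.standard_normal(emb.shape).astype(np.float32)  # stand-in for round-off

tok_a = quantize(emb)            # "non-streaming" embeddings
tok_b = quantize(emb + noise)    # slightly perturbed "streaming" embeddings
mismatch = (tok_a != tok_b).mean()
print(f"token mismatch rate: {mismatch:.3%}")
```

With noise this small the mismatch rate is typically zero or very low, which matches the observation above that the semantic (first-level) tokens agreed while higher levels, whose residual embeddings are smaller in magnitude, diverged.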
How to reproduce this issue: