It appears there might be a bug in the audio framing.
If hop_len is rounded once up front, the rounding error accumulates over a long audio file. It seems better to keep it as a float and round only at the point of use, when computing each frame's sample position.
For example, if you want embeddings every 25 ms from 44100 Hz audio, the hop length is 1102.5 samples. Rounding that loses half a sample per hop, which adds up to a full 25 ms hop of drift after 2205 hops, i.e. roughly every 55 seconds of audio. This is only an illustration, because the sample rate here is 48000 Hz.
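
To make the drift concrete, here is a minimal sketch comparing the two approaches. It is not the project's actual framing code: `frame_starts` and `round_hop_first` are hypothetical names made up for this illustration, and only `hop_len` comes from the discussion above.

```python
def frame_starts(n_frames, sr, hop_seconds, round_hop_first):
    """Return the start sample of each frame (hypothetical helper)."""
    hop_len = sr * hop_seconds  # 1102.5 for 44100 Hz and a 25 ms hop
    if round_hop_first:
        # Round once, then reuse: the per-hop error accumulates.
        hop_len = round(hop_len)
        return [i * hop_len for i in range(n_frames)]
    # Keep the float hop and round each position: error stays within a sample.
    return [round(i * hop_len) for i in range(n_frames)]


sr, hop_seconds = 44100, 0.025
n_frames = 2206  # the last frame (index 2205) starts about 55 s into the audio

drifting = frame_starts(n_frames, sr, hop_seconds, round_hop_first=True)
stable = frame_starts(n_frames, sr, hop_seconds, round_hop_first=False)

# Difference between the two placements of the last frame, in seconds.
drift_seconds = (stable[-1] - drifting[-1]) / sr
print(f"drift after {n_frames - 1} hops: {drift_seconds * 1000:.2f} ms")
```

With these numbers the last frame ends up close to one full hop (about 25 ms) away from where it should be. At 48000 Hz a 25 ms hop is exactly 1200 samples, so both approaches give the same positions there; the difference only shows up when the hop is not a whole number of samples.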