Closed by sunnnnnnnny 2 months ago
I need to use the model to encode 16 kHz WAV files. Is it enough to resample the speech before encoding, and then save a 16 kHz WAV file after decoding the speech tokens?
The input and output for the WavTokenizer are audio files sampled at 24 kHz. If your input audio is sampled at 16 kHz, it must be resampled to 24 kHz before being processed by the WavTokenizer. Similarly, if a 16 kHz output is required, simply resample the 24 kHz output audio back to 16 kHz.
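The round trip described above can be sketched as follows. This is a minimal, dependency-free illustration, not WavTokenizer's own code: `resample_linear` and `roundtrip` are hypothetical helper names, `codec` is a placeholder for whatever encode-then-decode call you use, and linear interpolation is a simple stand-in for a proper resampler. In practice a band-limited resampler such as `torchaudio.functional.resample` or `scipy.signal.resample_poly` is the better choice.

```python
MODEL_SR = 24000  # sample rate WavTokenizer expects, per the answer above


def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono signal (list of floats) by linear interpolation.

    Naive stand-in for a real resampler: it applies no anti-aliasing filter.
    """
    if orig_sr == target_sr or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr   # position in the input signal
        lo = min(int(pos), last)        # sample at or before pos
        hi = min(lo + 1, last)          # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def roundtrip(samples, codec, file_sr=16000):
    """16 kHz in -> 24 kHz through the codec -> 16 kHz out.

    `codec` is a hypothetical callable that takes and returns 24 kHz samples
    (for example, encode to tokens and decode back).
    """
    model_in = resample_linear(samples, file_sr, MODEL_SR)
    model_out = codec(model_in)
    return resample_linear(model_out, MODEL_SR, file_sr)
```

For example, `roundtrip(audio_16k, my_codec)` hands `my_codec` a signal 1.5 times as long as the input (24000 / 16000) and returns a signal at the original 16 kHz length.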