This is the code for SpeechTokenizer, presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Hello, I used the checkpoint you trained on LibriSpeech to run inference on Chinese audio, and it still works well. Is that expected? Your training data appears to contain only English, with no Chinese.