What is the input for encoding at inference time?

ZhangXInFD / SpeechTokenizer

This is the code for SpeechTokenizer, presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are available at

https://0nutation.github.io/SpeechTokenizer.github.io/

Apache License 2.0

466 stars, 40 forks

What is the input for encoding at inference time? #7

Closed: Edwardmark closed this issue 5 months ago

Edwardmark commented 5 months ago

What is the input for encoding at inference time? I think only the raw audio waveform is the input, and no STFT or mel spectrogram is needed for inference. Is that right?

ZhangXInFD commented 5 months ago

Yes, that is right.
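To illustrate what "raw audio only" means in practice, here is a minimal sketch of preparing a waveform tensor for encoding. The helper `prepare_waveform` is a hypothetical name introduced for this example, and the `(batch, 1, samples)` layout and 16 kHz rate are assumptions that should be checked against the repository README. No STFT or mel features are computed at any point.

```python
import torch


def prepare_waveform(wav: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: shape a raw waveform as (batch, 1, samples).

    Accepts either a 1-D tensor (samples,) or a 2-D tensor (channels, samples).
    Only the first channel is kept, so stereo input becomes mono.
    """
    if wav.dim() == 1:
        wav = wav.unsqueeze(0)  # (samples,) -> (1, samples)
    wav = wav[:1, :]            # keep the first channel only
    return wav.unsqueeze(0)     # add the batch dimension -> (1, 1, samples)


# Stand-in for a decoded audio file: 1 second of stereo noise.
# The 16 kHz sample rate here is an assumption for illustration.
wav = torch.randn(2, 16000)
batch = prepare_waveform(wav)
print(batch.shape)

# With the real model, this tensor would be passed straight to the encoder,
# along the lines of the README usage (verify names there before relying on them):
#   model = SpeechTokenizer.load_from_checkpoint(config_path, ckpt_path)
#   with torch.no_grad():
#       codes = model.encode(batch)
```

If the file's sample rate differs from the one the model expects, the waveform would need to be resampled first; that is still a time-domain operation on the raw signal, not a spectral feature extraction.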

Edwardmark commented 5 months ago

@ZhangXInFD Thanks for your quick and helpful reply. Your work is really great!