Comparison with Whisper

Hi there, great work! I was wondering whether WavTokenizer can be compared with Whisper. Since the Qwen-Audio series uses Whisper as its audio encoder, could WavTokenizer be used as an alternative, and what are its advantages and disadvantages?

Thanks

WavTokenizer can be applied to the Qwen-Audio series, as well as to the recently introduced Mini-Omni and LLaMA-Omni series. For a comparison with Whisper, please refer to our previous response.

It is worth noting that we believe codec-based approaches hold greater potential for the future than Whisper. The current challenge appears to lie in WavTokenizer's encoder, which is not yet powerful enough; this is a limitation that we are actively working to address.

jishengpeng / WavTokenizer

Comparison with Whisper #27