tiennguyen12g closed 10 months ago
No, Whisper models do not have any notion of phonemes. They are end-to-end models that go directly from the audio signal to subword tokens (so letters). Getting phonemes would require a separate model dedicated to that.
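To make the "separate step" concrete, here is a minimal sketch of what a phoneme stage could look like after transcription. It is not part of Whisper. The `transcript` string stands in for Whisper's text output, and `TOY_LEXICON` and `text_to_phonemes` are hypothetical names with a tiny hand-written ARPAbet-style lexicon, used only for illustration. A real setup would use a dedicated grapheme-to-phoneme model or a full pronunciation dictionary instead.

```python
import re

# Hypothetical, hand-written mini lexicon with ARPAbet-style symbols.
# A real system would use a full pronunciation dictionary or a G2P model.
TOY_LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def text_to_phonemes(transcript, lexicon=TOY_LEXICON):
    """Map each word of a transcript to its phonemes via dictionary lookup.

    Words missing from the lexicon map to None, which is exactly the case
    where a dedicated grapheme-to-phoneme model would be needed.
    """
    # Lowercase and keep only runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", transcript.lower())
    return [(word, lexicon.get(word)) for word in words]


# Stand-in for the text that Whisper would return (assumption, not real output).
transcript = "Hello, world!"
for word, phonemes in text_to_phonemes(transcript):
    print(word, phonemes)
```

The point of the sketch is only that the phonemes come from a second lookup on the text, not from anything Whisper itself produces.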
Hello everyone, if you have experience with this case, please let me know. Thank you.