microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License

Does the pre-trained model for the hidden unit tokenizer use speaker embeddings? #73

Open Kodhandarama opened 9 months ago

Kodhandarama commented 9 months ago

Can you please elaborate on the role of speaker embeddings in the hidden unit tokenizer and what effect they have?