microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License

Text feature extraction using SpeechLM #68

Open wonjune-kang opened 6 months ago

wonjune-kang commented 6 months ago

Hello, thank you for your great work on this repo!

I'm interested in using SpeechLM to produce text representations in addition to speech representations. Is there a simple way to do this, similar to the code snippet you provide for speech representations? I found the forward_text function, which looks like it should do this, but I couldn't figure out how to produce the src_tokens to feed into it.

Thank you!