Hello, thank you for your great work on this repo!
I'm interested in using SpeechLM to produce text representations in addition to speech representations. Is there a simple way of doing this, similar to the code snippet that you provide here for speech representations? I found the `forward_text` function, which seems like it should be doing this, but I couldn't figure out a way to easily produce `src_tokens` to feed in there.
Thank you!