Open BenoitWang opened 2 months ago
Hi, I tested WavLLM and found that it can sometimes understand non-speech audio sounds too. Since all the training data mentioned in the paper is speech-related, I wonder where this capability comes from?
I guess the Whisper encoder may already be able to process non-speech audio, and that Whisper's feature spaces for audio and speech are close to each other.
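One rough way to check this guess would be to compare pooled encoder features for a speech clip and a non-speech clip. Below is a minimal sketch, assuming the encoder outputs have already been extracted as lists of frame vectors. The helper names `mean_pool` and `cosine_similarity` are made up for illustration, and the numbers are placeholders, not real Whisper outputs.

```python
import math

def mean_pool(frames):
    """Average a list of frame vectors (T x D) into a single D-dim vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dim)]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Placeholder features (assumption): in practice these would be the
# encoder's frame-level outputs for one speech clip and one sound clip.
speech_frames = [[1.0, 0.0], [1.0, 2.0]]
sound_frames = [[2.0, 2.0], [2.0, 2.0]]

similarity = cosine_similarity(mean_pool(speech_frames), mean_pool(sound_frames))
print(f"cosine similarity: {similarity:.3f}")
```

A high similarity on real features would be consistent with the guess, though it would not prove that the LLM can actually use those features for sound understanding.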