🚀 The feature, motivation and pitch
Hi there, I am trying to use a wav2vec encoder, which does not seem to be supported by SLAM-LLM yet. However, the training and test accuracy for the ASR task are very low, with only 30% training accuracy. The encoder uses the embedding from the last hidden layer. Do you have any idea what might be going wrong, or any plan to support this functionality?
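To make "last hidden layer embedding" concrete, here is a minimal sketch of that kind of feature extraction. It assumes the Hugging Face `transformers` implementation of wav2vec 2.0, which is an assumption for illustration and not necessarily the setup used here. The tiny, randomly initialized config is also hypothetical and only keeps the example free of downloads; a real run would load pretrained weights with `Wav2Vec2Model.from_pretrained(...)`.

```python
import torch
from transformers import Wav2Vec2Config, Wav2Vec2Model

# Hypothetical tiny config so the sketch runs without downloading weights.
# The convolutional feature extractor keeps its default layout
# (total stride of 320 samples, i.e. roughly one frame per 20 ms at 16 kHz).
config = Wav2Vec2Config(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=128,
)
model = Wav2Vec2Model(config)
model.eval()  # disables dropout and time masking

# One second of dummy 16 kHz audio, shape (batch, samples).
waveform = torch.randn(1, 16000)

with torch.no_grad():
    outputs = model(waveform)

# Output of the last transformer layer: (batch, frames, hidden_size).
features = outputs.last_hidden_state
print(features.shape)
```

The result is one embedding per frame, so whatever consumes it downstream has to expect this frame rate and hidden size rather than those of another encoder.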
Alternatives
No response
Additional context
No response