roudimit / whisper-flamingo

[Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
https://arxiv.org/abs/2406.10082
Other