QwenLM / Qwen-Audio

The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.

Performance degradation on known languages of Whisper #13

Open akashicMarga opened 11 months ago

akashicMarga commented 11 months ago

I was trying to chat over an audio file. The Whisper-v2 transcription was good, nearly perfect, but the Qwen transcription did not capture the whole recording. I was trying it on Hindi audio, assuming Whisper's performance for Hindi is very good. Next I tried summarising the audio, but that did not give proper results either. Has anyone tried Qwen for languages supported by Whisper?

NeonBohdan commented 10 months ago

It uses only the Whisper encoder and a Qwen decoder. It was trained on English, Chinese and some other languages, listed in the paper. Only English, Chinese, German, Spanish, French, Italian, Japanese and Korean.

I haven't seen Hindi there
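
A minimal sketch of how one might guard against this before sending audio to the model. It checks a language name against the list quoted in the comment above. The set is copied from this thread, not read from the model or the paper, and the names `LISTED_LANGUAGES`, `is_listed` and `pick_transcriber` are hypothetical helpers, not part of any Qwen-Audio or Whisper API. The fallback to Whisper is an assumption based only on the original poster's report that Whisper-v2 handled their Hindi audio well.

```python
# Languages named in this thread as covered by Qwen-Audio's training data.
# Assumption: copied from the comment above, not verified against the paper.
LISTED_LANGUAGES = {
    "english", "chinese", "german", "spanish",
    "french", "italian", "japanese", "korean",
}


def is_listed(language: str) -> bool:
    """Return True if the language appears in the list quoted in this thread."""
    return language.strip().lower() in LISTED_LANGUAGES


def pick_transcriber(language: str) -> str:
    """Hypothetical routing rule: use Qwen-Audio only for listed languages,
    otherwise fall back to a plain Whisper transcription."""
    return "qwen-audio" if is_listed(language) else "whisper"


if __name__ == "__main__":
    for lang in ("Hindi", "English", "Korean"):
        print(lang, "->", pick_transcriber(lang))
```

With this rule, Hindi audio would be routed to Whisper rather than Qwen-Audio, which matches what the original poster observed in practice.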