YuanGongND / whisper-at

Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
BSD 2-Clause "Simplified" License
312 stars, 25 forks

Possible to use faster-whisper (https://github.com/guillaumekln/faster-whisper) as a backend? #1

Open tensorboy opened 1 year ago

YuanGongND commented 1 year ago

Hi there,

Thanks so much for pointing this out.

Yes, I have used faster-whisper for another project, but I am not familiar with CTranslate2. I believe whisper-at could be backed by CTranslate2, since we use the same Transformer block implementation as the original Whisper, but it is unlikely that I will have time to implement that.

-Yuan

Ar770 commented 6 months ago

@YuanGongND, I want to make a PR with a faster-whisper implementation. Could you please highlight what needs to be done to implement it?

dgoryeo commented 5 months ago

@Ar770, @YuanGongND, faster-whisper is a great idea. Did you by any chance get to implement it already?

YuanGongND commented 5 months ago

@dgoryeo

Thanks. It is a good idea, but I won't have time to implement it (we are not a company). A third-party implementation is very welcome.

-Yuan

dgoryeo commented 5 months ago

Fully understood. I'm not an ML engineer; otherwise I would be keen to help.