Questions about full fine-tuning of Whisper

yeyupiaoling / Whisper-Finetune

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

Apache License 2.0

813 stars 129 forks

Hello, I would like to ask a couple of questions about fine-tuning:

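According to the test results table on the project home page, the model fine-tuned on AISHELL performs better on test_meeting than the model fine-tuned on WenetSpeech. That seems counterintuitive, since WenetSpeech is a far larger dataset. Judging from this result, does fine-tuning on WenetSpeech offer any advantage at all?
Why has the code for full fine-tuning not been released? The base Whisper model can recognize Chinese, but isn't it said to be stronger at English? If the target use case is Chinese and a large amount of data is available, such as WenetSpeech, wouldn't it be possible to fully fine-tune a Chinese-adapted version of Whisper? Could you release a version of full fine-tuning with accelerated training? :)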

Questions about full fine-tuning of Whisper #76