k2-fsa / icefall

https://k2-fsa.github.io/icefall/
Apache License 2.0

update speechio whisper ft results #1605

Closed yuekaizhang closed 2 weeks ago

yuekaizhang commented 3 weeks ago

Fine-tuned whisper-large-v2 on the multi-hans-zh dataset, excluding datatang 200h (which is no longer open-sourced) and updating WenetSpeech (according to https://github.com/wenet-e2e/WenetSpeech/discussions/54).

The model currently ranks 8th on the SpeechColab leaderboard (https://github.com/SpeechColab/Leaderboard):

| Rank | Model | CER | Date |
|------|-------|-----|------|
| 1 | ximalaya_api_zh | 1.72% | 2023.12 |
| 2 | aliyun_ftasr_api_zh | 1.85% | 2023.12 |
| 3 | microsoft_batch_zh | 2.40% | 2023.12 |
| 4 | bilibili_api_zh | 2.90% | 2023.09 |
| 5 | tencent_api_zh | 3.18% | 2023.12 |
| 6 | iflytek_lfasr_api_zh | 3.32% | 2023.12 |
| 7 | aispeech_api_zh | 3.62% | 2023.12 |
| 8 | whisper-large-ft-v1 | 4.45% | 2024.04 |
| 9 | baidu_pro_api_zh | 7.29% | 2023.12 |


| Datasets | Split | Greedy Search |
|----------|-------|---------------|
| alimeeting | eval | 23.45 |
| alimeeting | test | 25.42 |
| aishell-1 | dev | 0.78 |
| aishell-1 | test | 0.83 |
| aishell-2 | dev | 2.75 |
| aishell-2 | test | 2.93 |
| aishell-4 | test | 17.11 |
| magicdata | dev | 2.68 |
| magicdata | test | 2.33 |
| kespeech-asr | dev phase1 | 4.97 |
| kespeech-asr | dev phase2 | 2.02 |
| kespeech-asr | test | 6.34 |
| WenetSpeech | dev | 5.06 |
| WenetSpeech | test meeting | 8.38 |
| WenetSpeech | test net | 6.94 |

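
For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate (CER) can be computed: the character-level edit distance between hypothesis and reference, summed over utterances and divided by the total number of reference characters. The function names `edit_distance` and `cer` are hypothetical, and stripping spaces is an assumed simplification. The actual icefall and SpeechIO scoring pipelines apply their own text normalization, which is not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps):
    """Corpus-level CER in percent; spaces are ignored (an assumption)."""
    total_errors = 0
    total_chars = 0
    for ref, hyp in zip(refs, hyps):
        ref_chars = list(ref.replace(" ", ""))
        hyp_chars = list(hyp.replace(" ", ""))
        total_errors += edit_distance(ref_chars, hyp_chars)
        total_chars += len(ref_chars)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    return 100.0 * total_errors / total_chars


if __name__ == "__main__":
    # One substituted character out of six reference characters.
    print(f"{cer(['今天天气很好'], ['今天天气真好']):.2f}")
```

Note that summing errors over the whole test set before dividing weights long utterances more heavily than averaging per-utterance rates would.
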
JinZr commented 3 weeks ago

Thank you so much!

I'll look into this PR tonight!

JinZr commented 2 weeks ago

Thank you yuekai!

I left a few comments at the PR, please check those at your convenience.

best jin

yuekaizhang commented 2 weeks ago

> Thank you yuekai!
>
> I left a few comments at the PR, please check those at your convenience.
>
> best jin

@JinZr Done. Thanks.