wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0

More deletion error when using TLG decode #929

Closed. cliffchen123 closed this issue 2 years ago.

cliffchen123 commented 2 years ago

When decoding with TLG, I tried raising acoustic_scale. It gives a better CER, but it also causes more deletion errors. Is there a good way to adjust the relative weight of the acoustic and graph scores without increasing deletion errors?

robin1001 commented 2 years ago

Adding an LM always results in more deletion errors.

Arrivederci commented 2 years ago

robin1001 wrote: "Adding an LM always results in more deletion errors."

I have the same problem. Could you explain why an LM always leads to more deletion errors? Thanks.

cliffchen123 commented 2 years ago

@Arrivederci I solved this problem by adding a deletion penalty in the decoding process to reduce deletion errors.

Arrivederci commented 2 years ago

@cliffchen123 Did you use the code in the WeNet runtime? I can't find where to add a deletion penalty.

cliffchen123 commented 2 years ago

@Arrivederci I added the deletion penalty to the graph_cost in WFST decoding. @robin1001 Do you think it is worth opening a PR?
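For readers looking for where such a penalty could go, here is a minimal sketch of one possible reading of the comment above. It is not cliffchen123's code and not the WeNet runtime API. It assumes a Kaldi-style path cost (graph cost plus scaled acoustic cost, lower is better) and treats the penalty as a fixed amount subtracted from the graph cost of every arc that emits a word, so hypotheses that drop words lose that credit. The names `Arc`, `path_cost`, and `deletion_penalty`, and all the numbers, are hypothetical.

```python
from typing import List, NamedTuple


class Arc(NamedTuple):
    olabel: int           # output (word) label; 0 means epsilon, no word emitted
    graph_cost: float     # cost from the TLG graph (negative log-probability)
    acoustic_cost: float  # negative acoustic log-likelihood along this arc


def path_cost(arcs: List[Arc], acoustic_scale: float = 1.0,
              deletion_penalty: float = 0.0) -> float:
    """Total cost of a path; lower is better (toy model, not the real decoder)."""
    total = 0.0
    for arc in arcs:
        graph_cost = arc.graph_cost
        if arc.olabel != 0:
            # Assumption: reward each emitted word, which is equivalent to
            # penalizing hypotheses that delete words.
            graph_cost -= deletion_penalty
        total += graph_cost + acoustic_scale * arc.acoustic_cost
    return total


# Hypothetical hypotheses: "full" emits three words, "short" drops the middle one.
full = [Arc(1, 4.0, 6.0), Arc(2, 4.0, 6.0), Arc(3, 4.0, 6.0)]
short = [Arc(1, 4.0, 7.0), Arc(0, 0.0, 7.0), Arc(3, 4.0, 7.0)]

for penalty in (0.0, 1.5):
    costs = {"full": path_cost(full, deletion_penalty=penalty),
             "short": path_cost(short, deletion_penalty=penalty)}
    print(penalty, costs, "best:", min(costs, key=costs.get))
```

In this toy setup the shorter hypothesis wins without the penalty because it pays less graph cost, and the full hypothesis wins once each emitted word is credited 1.5. In a real decoder the value would need tuning on a development set, since too large a value would be expected to trade deletions for insertions.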

Yymax-max commented 2 years ago

robin1001 wrote: "Adding an LM always results in more deletion errors."

I have the same problem too. Could you share some ideas? Thanks.

kangj13 commented 2 years ago

cliffchen123 wrote: "@Arrivederci I added the deletion penalty to the graph_cost in WFST decoding. @robin1001 Do you think it is worth opening a PR?"

I have the same problem. Could you open a PR? Thanks.

Arrivederci commented 2 years ago

cliffchen123 wrote: "@Arrivederci I added the deletion penalty to the graph_cost in WFST decoding. @robin1001 Do you think it is worth opening a PR?"

kangj13 wrote: "I have the same problem. Could you open a PR? Thanks."

@cliffchen123 That would be very helpful.

robin1001 commented 2 years ago

cliffchen123 wrote: "@Arrivederci I added the deletion penalty to the graph_cost in WFST decoding. @robin1001 Do you think it is worth opening a PR?"

Please make a PR to show how it works.