k2-fsa / kaldi-decoder

Decoders from Kaldi using OpenFst
Apache License 2.0

Add "length_penalty" config option #7

Open FredSRichardson opened 9 months ago

FredSRichardson commented 9 months ago

This is a very small change that I've found helps quite a lot. The WeNet decoder has a "length_penalty" option that really helps control deletion rates, which tend to be very high by default.

A good reference for the code change is the WeNet code base; the relevant lines are linked below. Basically, you add "length_penalty" (a float parameter with a negative value such as -3.0, so effectively a bonus) in the following two places in ProcessEmitting() in kaldi-decoder/csrc/faster-decoder.cc, and of course add the corresponding config option to kaldi-decoder/csrc/faster-decoder.h and kaldi-decoder/python/csrc/faster-decoder.cc:

runtime/core/kaldi/decoder/lattice-faster-decoder.cc#L781-L783

runtime/core/kaldi/decoder/lattice-faster-decoder.cc#L811-L813
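
For illustration, here is a minimal, self-contained sketch of the idea, not the actual patch. `DecoderOptions`, `Arc`, and `PenalizedGraphCost` are hypothetical stand-ins for the real decoder types. The sketch assumes the penalty is added to the graph cost of an emitting arc only when the arc leaves the current state (so self-loops are not affected), which is how I understand the linked WeNet lines; please check them before relying on this.

```cpp
// Hypothetical stand-in for the decoder config; only the new field is shown.
struct DecoderOptions {
  // Added to the graph cost of each non-self-loop emitting arc.
  // Costs are minimized, so a negative value (e.g. -3.0) acts as a bonus.
  float length_penalty = 0.0f;
};

// Hypothetical stand-in for an FST arc.
struct Arc {
  int ilabel;        // input label; 0 would mean epsilon (non-emitting)
  int nextstate;     // destination state
  float graph_cost;  // arc weight from the decoding graph
};

// Returns the graph cost to use for an emitting arc out of `state`.
// Assumption: self-loops (state == arc.nextstate) are left unchanged.
float PenalizedGraphCost(const DecoderOptions &opts, int state,
                         const Arc &arc) {
  float graph_cost = arc.graph_cost;
  if (state != arc.nextstate) {
    graph_cost += opts.length_penalty;
  }
  return graph_cost;
}
```

In ProcessEmitting() the same adjustment would presumably be needed in both places where an emitting arc's cost is computed (the pass that estimates the next cutoff and the pass that actually expands tokens), so that pruning and expansion see the same costs.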