louiskirsch / speechT

Open-source speech-to-text software written in TensorFlow
Apache License 2.0

tuneable parameters like word_count and valid_word_count_weight #13

Closed arpit601 closed 7 years ago

arpit601 commented 7 years ago

@timediv What range of values can these parameters take? Is there an article where I can read more about them and how to set them?

arpit601 commented 7 years ago

@timediv

louiskirsch commented 7 years ago

The KenLM language model results in longer sentences being less likely than shorter sentences. To offset that, three parameters are introduced:
word_count_weight: additional score added for every new word in the beam
valid_word_count_weight: additional score for every new word in the beam that exists in the vocabulary
language_model_weight: the weight of the logprob the language model predicts within the beam scoring
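
To make the interaction of the three weights concrete, here is a minimal sketch of how such a beam score could be combined. It assumes a simple additive form (acoustic log-probability plus the weighted language model log-probability plus the two per-word bonuses); the function name `beam_score`, the `BeamWeights` container, and the numeric values are hypothetical and are not taken from speechT's source.

```python
from dataclasses import dataclass


@dataclass
class BeamWeights:
    # Illustrative values only; these are NOT speechT's defaults.
    language_model_weight: float = 1.0
    word_count_weight: float = 0.5
    valid_word_count_weight: float = 0.5


def beam_score(acoustic_logprob, lm_logprob, words, vocabulary, weights):
    """Combine the scores for one beam hypothesis (assumed additive form)."""
    word_count = len(words)
    # Only words that exist in the vocabulary earn the second bonus.
    valid_word_count = sum(1 for word in words if word in vocabulary)
    return (acoustic_logprob
            + weights.language_model_weight * lm_logprob
            + weights.word_count_weight * word_count
            + weights.valid_word_count_weight * valid_word_count)


if __name__ == "__main__":
    vocabulary = {"hello", "world"}
    weights = BeamWeights()
    # The misspelled hypothesis gets the word bonus twice but the
    # valid-word bonus only once, so it scores lower at equal logprobs.
    print(beam_score(-10.0, -4.0, ["hello", "world"], vocabulary, weights))
    print(beam_score(-10.0, -4.0, ["hello", "wrld"], vocabulary, weights))
```

Because the language model log-probability becomes more negative with every additional word, the per-word bonuses are what keep longer hypotheses competitive in a scheme like this.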

arpit601 commented 7 years ago

@timediv In most papers, all of these parameters are optimized on a validation set. How can I do that without choosing the values myself?

louiskirsch commented 7 years ago

I optimized those parameters on a validation set using speecht-cli search (local search). The default parameters are therefore optimized already.
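
For readers who want to repeat this kind of tuning on their own data, the following is a generic sketch of a coordinate-wise local search. It is not the implementation behind speecht-cli search; the `local_search` function, its step-halving schedule, and the `evaluate` callback (which in practice would decode a validation set with the given weights and return an error rate) are all assumptions made for illustration.

```python
def local_search(evaluate, start, step=0.5, min_step=0.01):
    """Minimize evaluate(params) by nudging one parameter at a time.

    evaluate: hypothetical callback returning an error to minimize,
              e.g. the word error rate on a validation set.
    start:    dict mapping parameter names to initial values.
    """
    names = list(start)
    best = dict(start)
    best_error = evaluate(best)
    while step >= min_step:
        improved = False
        for name in names:
            for delta in (step, -step):
                candidate = dict(best)
                candidate[name] += delta
                error = evaluate(candidate)
                if error < best_error:
                    best, best_error = candidate, error
                    improved = True
        if not improved:
            # No neighbour was better at this step size: refine the step.
            step /= 2
    return best, best_error


if __name__ == "__main__":
    # Stand-in objective with a known minimum; replace it with a function
    # that decodes the validation set and returns the measured error.
    def toy_error(params):
        return ((params["language_model_weight"] - 1.0) ** 2
                + (params["word_count_weight"] + 2.0) ** 2)

    start = {"language_model_weight": 0.0, "word_count_weight": 0.0}
    print(local_search(toy_error, start))
```

A search like this only finds a local optimum, so the result depends on the starting values and on how representative the validation set is.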