danpovey / pocolm

Small language toolkit for creation, interpolation and pruning of ARPA language models
Other

Compute sentence prob #93

Closed DongjiGao closed 5 years ago

DongjiGao commented 6 years ago

Add get_sentence_prob.py, which computes sentence probabilities for an input text file given a language model in ARPA format. The results are written to an output file.
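For readers unfamiliar with the computation, here is a minimal sketch of scoring a sentence against an ARPA model with standard backoff. It is not the code from this pull request: the function names, the toy model `TOY_ARPA`, and the choice to map unknown words to `<unk>` are all assumptions made for illustration.

```python
# Illustrative sketch only; names and the toy model are hypothetical.
TOY_ARPA = r"""
\data\
ngram 1=4
ngram 2=2

\1-grams:
-1.0 <s> -0.5
-0.7 </s>
-0.6 a -0.3
-0.9 b -0.2

\2-grams:
-0.2 <s> a
-0.4 a b

\end\
"""


def read_arpa(lines):
    """Parse ARPA text into {ngram_tuple: (log10_prob, log10_backoff)}."""
    ngrams = {}
    order = 0  # 0 means "not inside an n-gram section"
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Headers such as "\2-grams:" open a section; "\data\" and "\end\" do not.
            order = int(line[1:line.index("-")]) if line.endswith("-grams:") else 0
            continue
        if order == 0:
            continue  # skip the "ngram N=count" lines in the \data\ block
        fields = line.split()
        words = tuple(fields[1:1 + order])
        # The backoff weight is optional; a missing weight means log10(1) = 0.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        ngrams[words] = (float(fields[0]), backoff)
    return ngrams


def word_logprob(ngrams, history, word):
    """Return log10 P(word | history), backing off to shorter histories."""
    penalty = 0.0
    while True:
        key = history + (word,)
        if key in ngrams:
            return penalty + ngrams[key][0]
        if not history:
            raise KeyError(f"no unigram entry for {word!r}")
        # Add the backoff weight of the history if the model lists it.
        if history in ngrams:
            penalty += ngrams[history][1]
        history = history[1:]


def sentence_logprob(ngrams, words):
    """Return log10 P(sentence), including the end-of-sentence token."""
    max_order = max(len(key) for key in ngrams)
    # Assumption: out-of-vocabulary words are mapped to <unk>.
    tokens = ["<s>"] + [w if (w,) in ngrams else "<unk>" for w in words] + ["</s>"]
    total = 0.0
    for i in range(1, len(tokens)):
        history = tuple(tokens[max(0, i - max_order + 1):i])
        total += word_logprob(ngrams, history, tokens[i])
    return total


toy_lm = read_arpa(TOY_ARPA.splitlines())
print(f"{sentence_logprob(toy_lm, 'a b'.split()):.4f}")
```

The scores are base-10 logs, as stored in ARPA files, so they can be compared directly with other tools that report log10 probabilities.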

danpovey commented 6 years ago

I'm wondering whether it might make more sense to add this in Kaldi instead? It seems to me that it's independent of where the LM comes from. Remind me why we needed this. Also, please use SRILM to verify that it's correct. SRILM has an option to compute the probs per word.
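One way to carry out the SRILM check is sketched below. It assumes SRILM's `ngram` tool was run with `-lm`, `-ppl` and `-debug 1`, that its output has a `logprob= <value>` field after each sentence, and that the last such field is the whole-file summary. The helper names and the sample text are hypothetical, and the exact output layout should be confirmed against a real run.

```python
import re

# Matches the numeric value following "logprob=" (assumed SRILM output field).
LOGPROB_RE = re.compile(r"logprob=\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def srilm_sentence_logprobs(ppl_output):
    """Extract per-sentence log10 probabilities from `ngram -ppl -debug 1` text."""
    values = [float(m.group(1)) for m in LOGPROB_RE.finditer(ppl_output)]
    # Assumption: the final logprob is the corpus-level total, not a sentence.
    return values[:-1]


def mismatches(ours, theirs, tol=1e-3):
    """Return indices of sentences whose scores differ by more than tol."""
    if len(ours) != len(theirs):
        raise ValueError("sentence counts differ")
    return [i for i, (a, b) in enumerate(zip(ours, theirs)) if abs(a - b) > tol]


# Hypothetical sample in the assumed layout, for illustration only.
SAMPLE = """a b
1 sentences, 2 words, 0 OOVs
0 zeroprobs, logprob= -1.5 ppl= 3.16228 ppl1= 5.62341

b
1 sentences, 1 words, 0 OOVs
0 zeroprobs, logprob= -2.25 ppl= 13.3352 ppl1= 177.828

file test.txt: 2 sentences, 3 words, 0 OOVs
0 zeroprobs, logprob= -3.75 ppl= 5.62341 ppl1= 17.7828
"""

reference = srilm_sentence_logprobs(SAMPLE)
print(mismatches([-1.5, -2.2], reference))
```

Any index printed by `mismatches` points at a sentence worth inspecting word by word, which is where SRILM's per-word output becomes useful.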

On Thu, Jul 5, 2018 at 4:51 PM, DongjiGao notifications@github.com wrote:

Add get_sentence_prob.py that can compute sentence probability from input text file given language model in arpa form. The result would be written to an output file.

You can view, comment on, or merge this pull request online at:

https://github.com/danpovey/pocolm/pull/93

Commit Summary

  • add get_sentence_prob.py that can compute sentence probability given arpa lm
  • small fix on output format


DongjiGao commented 6 years ago

Xiaohui needs a script that computes sentence probabilities, similar to get_data_probs.py in pocolm (see https://github.com/danpovey/pocolm/issues/92). I have checked several sentences by hand and they seem correct. I will also check against SRILM.

danpovey commented 6 years ago

OK, thanks. When you've checked it, though, please show it to Xiaohui and he can incorporate it into what he's doing directly, as a script that's in Kaldi. I think this may make more sense.

On Thu, Jul 5, 2018 at 8:45 PM, DongjiGao notifications@github.com wrote:

Xiaohui needs a script to compute sentence probability like get_data_probs.py in pocolm.

#92: https://github.com/danpovey/pocolm/issues/92

I have checked several sentences by hand and they seem correct. I will check with SRILM.


DongjiGao commented 6 years ago

Will do. It might take me some time since I have not used SRILM before.