swaileh / transducersaurus

Automatically exported from code.google.com/p/transducersaurus

Reduce size of C in CMUSphinx model #8

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
If the phoneset is large, C can become enormously big. Here is a patch to remove 
some unneeded transitions.

Original issue reported on code.google.com by nshmy...@gmail.com on 3 Apr 2011 at 10:05

GoogleCodeExporter commented 8 years ago
Well, after some poking around, I think the current transducersaurus does 
very badly on Sphinx models. The reasons are:

1. The C transducer generates a lot of transitions. If a triphone is 
missing from mdef, it actually means that triphone is not possible at all, so 
its transitions are unneeded.
2. HMMs aren't properly tied. They need to be tied according to the senone 
sequence, not just by position.
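
The two points above can be sketched roughly as follows. This is only an illustration, not the attached patch: the `MDEF` table is a made-up stand-in for parsed mdef entries, and the function names `allowed_triphones` and `tie_by_senones` are hypothetical, not part of transducersaurus.

```python
from collections import defaultdict

# Hypothetical stand-in for parsed mdef entries (not real model data):
# (base, left, right, position) -> senone id sequence.
MDEF = {
    ("AA", "B", "T", "i"): (10, 11, 12),
    ("AA", "D", "T", "i"): (10, 11, 12),
    ("AA", "B", "K", "i"): (10, 13, 14),
}


def allowed_triphones(candidates, mdef):
    """Point 1: keep only the candidate triphones that are listed in mdef."""
    return [tri for tri in candidates if tri in mdef]


def tie_by_senones(mdef):
    """Point 2: group triphones that share the same senone sequence."""
    classes = defaultdict(list)
    for tri, senones in sorted(mdef.items()):
        classes[senones].append(tri)
    return dict(classes)


if __name__ == "__main__":
    candidates = [("AA", "B", "T", "i"), ("AA", "Z", "Z", "i")]
    print(allowed_triphones(candidates, MDEF))
    for senones, triphones in tie_by_senones(MDEF).items():
        print(senones, triphones)
```

Under these assumptions, a triphone absent from the table never gets a transition in C, and triphones with identical senone sequences end up sharing one HMM instead of each getting its own.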

This needs some work; maybe I'll comment in more detail later. For now, the 
performance of the model is very bad compared to the approach suggested in 
issue #1.

Original comment by nshmy...@gmail.com on 3 Apr 2011 at 10:10

GoogleCodeExporter commented 8 years ago
Here is the final patch I've applied to create a high-performance Sphinx 
word-position cascade.

Original comment by nshmy...@gmail.com on 8 Apr 2011 at 1:08

Attachments:

GoogleCodeExporter commented 8 years ago
Marking as fixed in this revision. Current Sphinx context-dependent (cd) models 
should be smaller and more economical now.

Original comment by Josef.Ro...@gmail.com on 10 May 2011 at 4:15

GoogleCodeExporter commented 8 years ago
Re-opened just in case.

Original comment by Josef.Ro...@gmail.com on 12 May 2011 at 12:04