darongmean / phonetisaurus

Automatically exported from code.google.com/p/phonetisaurus

Include the input word in the output when reading a list of words #13

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Using the provided model trained from CMUdict 0.7, provide a list of words, not all of which have more than one hypothesis, e.g.

phonetisaurus-g2p -m /data/src/sphinx/tools/phonetisaurus/script/g014b2/g014b2.fst -n 2 -t list1.txt

where list1.txt is

HELLO
ABBREVIATIONAL
GOODBYE

What is the expected output? What do you see instead?

ABBREVIATIONAL has only 1 hypothesis, so the output has 5 lines rather than 6, and nothing on a line indicates which input word it belongs to, i.e.

11.9741 HH EH1 L OW0
13.732  HH AH0 L OW1
17.8924 AH0 B R IY2 V IY0 EY1 SH AH0 N AH0 L
14.8661 G UH2 D B AY1
18.3997 G UH1 D B AY1

To make it easier to match input to output for sets of words, it would be helpful to have the word on each output line, e.g.

HELLO 11.9741   HH EH1 L OW0
HELLO 13.732    HH AH0 L OW1
ABBREVIATIONAL 17.8924  AH0 B R IY2 V IY0 EY1 SH AH0 N AH0 L
GOODBYE 14.8661 G UH2 D B AY1
GOODBYE 18.3997 G UH1 D B AY1
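
With the word on each line, matching hypotheses back to input words becomes a simple grouping step. Below is a minimal Python sketch, assuming the proposed format is whitespace-separated `WORD SCORE PHONEME...`; the function name `group_hypotheses` is hypothetical and not part of phonetisaurus.

```python
from collections import defaultdict


def group_hypotheses(lines):
    """Group 'WORD SCORE PHONEME...' lines by word (assumed format)."""
    grouped = defaultdict(list)
    for line in lines:
        fields = line.split()  # splits on any run of spaces or tabs
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        word, score, phonemes = fields[0], float(fields[1]), fields[2:]
        grouped[word].append((score, phonemes))
    return dict(grouped)


# Sample lines copied from the proposed output above.
sample = """\
HELLO 11.9741   HH EH1 L OW0
HELLO 13.732    HH AH0 L OW1
ABBREVIATIONAL 17.8924  AH0 B R IY2 V IY0 EY1 SH AH0 N AH0 L
GOODBYE 14.8661 G UH2 D B AY1
GOODBYE 18.3997 G UH1 D B AY1
""".splitlines()

result = group_hypotheses(sample)
for word, hyps in result.items():
    print(word, len(hyps))
```

Because each line carries its own word, a word with fewer than the requested number of hypotheses no longer shifts the alignment of the lines that follow.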

What version of the product are you using? On what operating system?

Current checkout as at time of filing this issue, MacOS X Snow Leopard.

/data/src/sphinx/tools/phonetisaurus $ hg log
changeset:   33:f3c62a829cee
tag:         tip
user:        Josef Novak <josef.robert.novak@gmail.com>
date:        Sat Apr 16 17:33:44 2011 +0900
summary:     Added swapped+reversed test set, completing the full range of options in this regard.

Please provide any additional information below.

Original issue reported on code.google.com by smarqu...@gmail.com on 13 May 2011 at 1:54

GoogleCodeExporter commented 9 years ago
Sounds good, how about if I add this as an option?

Original comment by Josef.Ro...@gmail.com on 25 May 2011 at 3:14

GoogleCodeExporter commented 9 years ago
That would be fine, thanks.

Original comment by smarqu...@gmail.com on 25 May 2011 at 6:12

GoogleCodeExporter commented 9 years ago
This should be fixed now.  I added a '-o' option to the tool that prints the input word on each output line.  It should work with the 'test list' as well as the 'test word' option, and work correctly with nbest.

Original comment by Josef.Ro...@gmail.com on 28 May 2011 at 8:38