Not as a text file. You could turn your text files into a .trn-formatted file to accomplish this. Essentially, put all the text on a single line with an utterance id in parentheses at the end.
@jfiscus so you would have something like this in your text file, right?
EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T (abc000)
K L OW1 Z K W OW1 T (abc001)
D AH1 B AH0 L K W OW1 T (abc002)
Follow-up question: sclite matches based on ID, right? Meaning there is no need to have the hyp and ref files aligned in order of utterances.
Say you have:
$ cat abc000
EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T
$ cat abc001
K L OW1 Z K W OW1 T
$ cat abc002
D AH1 B AH0 L K W OW1 T
Then you can create a trn file which has the contents you've listed:
$ cat all.trn
EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T (abc000)
D AH1 B AH0 L K W OW1 T (abc002)
K L OW1 Z K W OW1 T (abc001)
The records in all.trn don't have to be in any order, as sclite will match by the abc00? id.
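If you have many such files, the conversion can be scripted. Here is a minimal sketch in Python, assuming each text file holds one utterance and that the file name (minus any extension) is what you want as the utterance id. The function name `build_trn` and the temporary-directory setup are just for illustration, not part of sclite.

```python
from pathlib import Path
import tempfile


def build_trn(text_files):
    """Return trn lines: one utterance per line, id in parentheses at the end."""
    lines = []
    for path in text_files:
        path = Path(path)
        # Collapse all whitespace (including newlines) so the text is on one line.
        text = " ".join(path.read_text().split())
        # Assumption: the file name without extension serves as the utterance id.
        lines.append(f"{text} ({path.stem})")
    return lines


# Usage: recreate the three example files in a temporary directory.
samples = {
    "abc000": "EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T",
    "abc001": "K L OW1 Z K W OW1 T",
    "abc002": "D AH1 B AH0 L K W OW1 T",
}
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in samples.items():
        p = Path(tmp) / name
        p.write_text(text + "\n")
        paths.append(p)
    trn_lines = build_trn(paths)

print("\n".join(trn_lines))
```

Writing `trn_lines` out to a file such as all.trn gives you something you can hand to sclite as a trn-format input.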
The trn format has a speaker and utterance id in the parentheses. Do this to make three utterances for speaker abc. Order does not matter.
$ cat all.trn
EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T (abc-000)
D AH1 B AH0 L K W OW1 T (abc-002)
K L OW1 Z K W OW1 T (abc-001)
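To see why the order does not matter, here is a rough sketch in Python of pairing records by the id in the parentheses. This only illustrates the idea of id-keyed matching; it is not how sclite is implemented, and the helper `read_trn` and the regular expression are assumptions made for the example.

```python
import re

# Text, then an id in parentheses at the end of the line.
TRN_LINE = re.compile(r"^(.*?)\s*\(([^()\s]+)\)\s*$")


def read_trn(lines):
    """Map each utterance id to its list of tokens."""
    records = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        m = TRN_LINE.match(line)
        if m is None:
            raise ValueError(f"not a trn record: {line!r}")
        text, utt_id = m.groups()
        records[utt_id] = text.split()
    return records


ref = read_trn([
    "EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T (abc-000)",
    "K L OW1 Z K W OW1 T (abc-001)",
    "D AH1 B AH0 L K W OW1 T (abc-002)",
])
# Same records as ref, listed in a different order.
hyp = read_trn([
    "EH2 K S K L AH0 M EY1 SH AH0 N P OY2 N T (abc-000)",
    "D AH1 B AH0 L K W OW1 T (abc-002)",
    "K L OW1 Z K W OW1 T (abc-001)",
])

# Pair reference and hypothesis by id, regardless of position in the file.
pairs = {utt_id: (ref[utt_id], hyp[utt_id]) for utt_id in ref if utt_id in hyp}
for utt_id in sorted(pairs):
    r, h = pairs[utt_id]
    print(utt_id, "match" if r == h else "differ")
```

Because the lookup is keyed on the id, shuffling the lines of either file leaves the pairs unchanged.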
Can I use sclite to compare two txt files?
I'm not comparing transcripts, just two plain text files