Closed csukuangfj closed 3 years ago
The usage of the command-line tools is similar to that of Lhotse.
Install snowfall
cd snowfall
python3 setup.py install
Invoke it
$ snowfall --help
Usage: snowfall [OPTIONS] COMMAND [ARGS]...
Entry point to the collection of utilities in snowall.
Options:
  --help  Show this message and exit.
Commands:
  ali  Alignment tools in snowfall
$ snowfall ali --help
Usage: snowfall ali [OPTIONS] COMMAND [ARGS]...
Alignment tools in snowfall
Options:
  --help  Show this message and exit.
Commands:
  edit-distance  Compute edit distance between two alignments.
$ snowfall ali edit-distance --help
Usage: snowfall ali edit-distance [OPTIONS]
Compute edit distance between two alignments.
The reference/hypothesis alignment file contains a python object Dict[str,
Alignment] and it can be loaded using torch.load. The dict is indexed by
utterance ID.
The symbol table, if provided, has the following format for each line:
symbol integer_id
It can be loaded by k2.SymbolTable.from_file().
Options:
  -r, --ref FILE           The file containing reference alignments  [required]
  -h, --hyp FILE           The file containing hypothesis alignments  [required]
  -t, --type TEXT          The type of the alignment to use for computing the edit distance  [required]
  -o, --output-file FILE   Output file  [required]
  -s, --symbol-table FILE  The symbol table for the given type of alignment
  --help                   Show this message and exit.
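For readers unfamiliar with the metric, here is a minimal sketch of what an edit-distance computation over two alignment dicts could look like. It is not the code in this PR: it assumes plain Levenshtein distance with unit costs, uses plain lists of integer IDs in place of the `Alignment` class, and the helper names `edit_distance`, `parse_symbol_table`, and `per_utterance_distance` are hypothetical.

```python
from typing import Dict, Sequence


def edit_distance(ref: Sequence[int], hyp: Sequence[int]) -> int:
    """Levenshtein distance with unit cost for ins/del/sub (an assumption)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty ref prefix
    for i, r in enumerate(ref, start=1):
        cur = [i]  # distance from ref[:i] to an empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cur.append(
                min(
                    prev[j] + 1,  # deletion of r
                    cur[j - 1] + 1,  # insertion of h
                    prev[j - 1] + (r != h),  # match or substitution
                )
            )
        prev = cur
    return prev[-1]


def parse_symbol_table(text: str) -> Dict[int, str]:
    """Parse lines of the form `symbol integer_id` into an id -> symbol map."""
    id2sym: Dict[int, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        sym, idx = line.split()
        id2sym[int(idx)] = sym
    return id2sym


def per_utterance_distance(
    ref: Dict[str, Sequence[int]], hyp: Dict[str, Sequence[int]]
) -> Dict[str, int]:
    """Edit distance for every utterance ID present in both dicts."""
    return {utt: edit_distance(ref[utt], hyp[utt]) for utt in ref if utt in hyp}


if __name__ == "__main__":
    ref = {"utt1": [1, 2, 3], "utt2": [4, 5]}
    hyp = {"utt1": [1, 3], "utt2": [4, 5]}
    print(per_utterance_distance(ref, hyp))
```

In the real tool the two dicts would come from `torch.load` and the symbol table from `k2.SymbolTable.from_file()`; the pure-Python parser above only illustrates the `symbol integer_id` line format.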
Ready for review.
LGTM! Thanks for the very fast work!
From https://github.com/lhotse-speech/lhotse/pull/304#issuecomment-839408455
From https://github.com/lhotse-speech/lhotse/pull/304#discussion_r645978753
I am adding a new class, Alignment, to implement the ideas from Dan. See https://github.com/k2-fsa/snowfall/blob/692d5c426e232e9391fd2580011ebfc6b1e0c035/snowfall/tools/ali.py#L11-L19

If the sampling rate is available and the alignment is frame-wise, I think it's very easy to convert it to CTM, like the one supported by Lhotse.
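As a rough illustration of that conversion, the sketch below collapses runs of identical frame labels into segments and prints them as CTM-style lines (`utterance channel start duration symbol`). It is an assumption-laden sketch, not the snowfall or Lhotse implementation: `frame_shift` (seconds per output frame, which would be derived from the sampling rate and any subsampling) and the function names are hypothetical, and merging adjacent identical labels would also merge a genuinely repeated symbol.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def frames_to_segments(
    labels: Sequence[str], frame_shift: float
) -> List[Tuple[str, float, float]]:
    """Collapse runs of identical frame labels into (label, start, duration)."""
    segments = []
    start_frame = 0
    for label, run in groupby(labels):
        num_frames = sum(1 for _ in run)
        segments.append(
            (label, start_frame * frame_shift, num_frames * frame_shift)
        )
        start_frame += num_frames
    return segments


def to_ctm_lines(
    utt_id: str,
    labels: Sequence[str],
    frame_shift: float,
    channel: int = 1,
) -> List[str]:
    """Format each segment as `utt channel start duration symbol`."""
    return [
        f"{utt_id} {channel} {start:.2f} {duration:.2f} {label}"
        for label, start, duration in frames_to_segments(labels, frame_shift)
    ]


if __name__ == "__main__":
    # Hypothetical frame-wise alignment with a 40 ms frame shift.
    frames = ["sil", "sil", "a", "a", "a", "b"]
    for line in to_ctm_lines("utt1", frames, frame_shift=0.04):
        print(line)
```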