k2-fsa / snowfall

Moved to https://github.com/k2-fsa/icefall
Apache License 2.0

Begin to add small tools for snowfall #207

Closed (csukuangfj closed this 3 years ago)

csukuangfj commented 3 years ago

From https://github.com/lhotse-speech/lhotse/pull/304#issuecomment-839408455

And we might want some way to represent those data alignments. E.g. one simple representation might just be the label sequences, indexed somehow by utterance-id. There are several different label-sequences that might be relevant here, depending on the type of system: the ilabel from the model, the phone-label "without repetitions", which we could store as a separate attribute on the graphs by using the "inner_labels=xxx" arg to compose in the appropriate stage of graph creation; and the olabel which is the word label. One possibility is to just store these as a dict indexed by utterance-id and then by 'ilabel', 'olabel' and 'phone_label' (for phone labels without repetitions) or something like that, and store it as a .pt file with torch.save().
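A minimal sketch of the storage layout described above, assuming plain nested dicts and made-up label values. The helper names (`add_utterance`, `save_alignments`, `load_alignments`) are hypothetical and are not part of snowfall or Lhotse; only `torch.save()` and `torch.load()` are real PyTorch calls.

```python
from typing import Dict, List

# Hypothetical alias: utterance-id -> label type -> label sequence.
AlignmentDict = Dict[str, Dict[str, List[int]]]


def add_utterance(
    alis: AlignmentDict,
    utt_id: str,
    ilabel: List[int],
    phone_label: List[int],
    olabel: List[int],
) -> None:
    """Store the three label sequences of one utterance under its ID."""
    alis[utt_id] = {
        "ilabel": list(ilabel),            # labels from the model
        "phone_label": list(phone_label),  # phone labels without repetitions
        "olabel": list(olabel),            # word labels
    }


def save_alignments(alis: AlignmentDict, filename: str) -> None:
    """Write the dict to a .pt file with torch.save()."""
    import torch  # imported lazily so the rest of the sketch runs without PyTorch

    torch.save(alis, filename)


def load_alignments(filename: str) -> AlignmentDict:
    """Read back a dict written by save_alignments()."""
    import torch

    return torch.load(filename)


alis: AlignmentDict = {}
# The label values below are invented purely for illustration.
add_utterance(alis, "utt-1", ilabel=[0, 0, 5, 5, 7], phone_label=[5, 7], olabel=[12])
```

Indexing first by utterance ID and then by label type keeps all alignments of one utterance together, so a tool can pick the label type it needs at lookup time.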


From https://github.com/lhotse-speech/lhotse/pull/304#discussion_r645978753

Btw @csukuangfj we should make sure that Lhotse alignments and Snowfall alignments interact well together. If there are any additions/changes that are helpful on Lhotse side, please let us know.


I am adding a new class Alignment to implement the ideas from Dan. See https://github.com/k2-fsa/snowfall/blob/692d5c426e232e9391fd2580011ebfc6b1e0c035/snowfall/tools/ali.py#L11-L19

If the sampling rate is available and the alignment is frame-wise, I think it is very easy to convert it to the CTM format, like the one supported by Lhotse.
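One way such a conversion could look, as a rough sketch rather than what snowfall or Lhotse actually does: consecutive identical frame labels are merged into one segment, and each segment becomes a CTM-style line `utt-id channel start duration label`. The function names, the `frame_shift` of 0.04 s (which in practice would be derived from the sampling rate, the feature hop size and any model subsampling), the channel number and the blank ID of 0 are all assumptions.

```python
from typing import List, Tuple


def framewise_to_segments(
    labels: List[int], frame_shift: float
) -> List[Tuple[float, float, int]]:
    """Merge runs of identical frame labels into (start, duration, label)."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the input or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append(
                (start * frame_shift, (i - start) * frame_shift, labels[start])
            )
            start = i
    return segments


def to_ctm_lines(
    utt_id: str,
    labels: List[int],
    frame_shift: float = 0.04,  # assumed seconds per output frame
    channel: int = 1,           # assumed channel number
    blank: int = 0,             # assumed ID of the blank symbol
) -> List[str]:
    """Format the non-blank segments as CTM-style lines."""
    lines = []
    for begin, duration, label in framewise_to_segments(labels, frame_shift):
        if label == blank:
            continue
        lines.append(f"{utt_id} {channel} {begin:.2f} {duration:.2f} {label}")
    return lines


ctm = to_ctm_lines("utt-1", [0, 0, 5, 5, 5, 7])
```

Note that this simple run-merging cannot separate two adjacent occurrences of the same label; a real converter would need extra information for that case.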

csukuangfj commented 3 years ago

The usage of the command-line tools is similar to that of Lhotse.

  1. Install snowfall

    cd snowfall
    python3 setup.py install

  2. Invoke it

    $ snowfall --help
    Usage: snowfall [OPTIONS] COMMAND [ARGS]...

      Entry point to the collection of utilities in snowall.

    Options:
      --help  Show this message and exit.

    Commands:
      ali  Alignment tools in snowfall

    $ snowfall ali --help
    Usage: snowfall ali [OPTIONS] COMMAND [ARGS]...

      Alignment tools in snowfall

    Options:
      --help  Show this message and exit.

    Commands:
      edit-distance  Compute edit distance between two alignments.

    $ snowfall ali edit-distance --help
    Usage: snowfall ali edit-distance [OPTIONS]

      Compute edit distance between two alignments.

      The reference/hypothesis alignment file contains a python object Dict[str,
      Alignment] and it can be loaded using torch.load. The dict is indexed by
      utterance ID.

      The symbol table, if provided, has the following format for each line:

          symbol integer_id

      It can be loaded by k2.SymbolTable.from_file().

    Options:
      -r, --ref FILE           The file containing reference alignments
                               [required]
      -h, --hyp FILE           The file containing hypothesis alignments
                               [required]
      -t, --type TEXT          The type of the alignment to use for computing
                               the edit distance  [required]
      -o, --output-file FILE   Output file  [required]
      -s, --symbol-table FILE  The symbol table for the given type of alignment
      --help                   Show this message and exit.
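To make the computation concrete, here is a small self-contained sketch of a Levenshtein edit distance over label sequences indexed by utterance ID, plus a parser for the `symbol integer_id` format. It is not the code behind `snowfall ali edit-distance`: the function names are hypothetical, plain lists stand in for the `Alignment` objects, and the pure-Python parser stands in for `k2.SymbolTable.from_file()`.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(ref: Sequence[int], hyp: Sequence[int]) -> int:
    """Levenshtein distance with unit costs for ins/del/sub."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(
                min(
                    prev[j] + 1,             # deletion
                    cur[j - 1] + 1,          # insertion
                    prev[j - 1] + (r != h),  # substitution or match
                )
            )
        prev = cur
    return prev[-1]


def total_edit_distance(
    ref_alis: Dict[str, List[int]], hyp_alis: Dict[str, List[int]]
) -> Tuple[int, int]:
    """Sum errors and reference lengths over utterances present in both dicts."""
    errors = 0
    ref_len = 0
    for utt_id in sorted(ref_alis.keys() & hyp_alis.keys()):
        errors += edit_distance(ref_alis[utt_id], hyp_alis[utt_id])
        ref_len += len(ref_alis[utt_id])
    return errors, ref_len


def parse_symbol_table(text: str) -> Dict[int, str]:
    """Parse lines of the form 'symbol integer_id' into an id -> symbol map."""
    id2sym = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        sym, idx = line.split()
        id2sym[int(idx)] = sym
    return id2sym


# Invented data, purely for illustration.
ref = {"utt-1": [1, 2, 3], "utt-2": [4, 5]}
hyp = {"utt-1": [1, 3], "utt-2": [4, 6]}
errors, ref_len = total_edit_distance(ref, hyp)
id2sym = parse_symbol_table("<eps> 0\na 1\nb 2\n")
```

With a symbol table available, the integer IDs in a report can be mapped back to readable symbols via `id2sym`.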

csukuangfj commented 3 years ago

Ready for review.

danpovey commented 3 years ago

LGTM! Thanks for the very fast work!