k2-fsa / snowfall

Moved to https://github.com/k2-fsa/icefall
Apache License 2.0

Constructing ctc decoding graph in a batch #225

Closed by pkufool 3 years ago

pkufool commented 3 years ago

See the comments in #220:

Thanks for doing the comparison, and sure, that's a good idea. Yes, we should introduce a special-purpose function that constructs a batch of CTC graphs from a ragged tensor consisting of the linear symbol sequences for each one.

The current functions in k2 already support constructing CTC decoding graphs in a batch, so I don't think anything else needs to be done on the C++ side. Correct me if I have misunderstood.

danpovey commented 3 years ago

I think we were talking about doing it in a single function, rather than a sequence of functions. BTW, we'd have to decide which type of topology to use, in terms of how it deals with repeats of the same symbol (i.e. do we require a blank between them?). If we require a blank between those repeats, the code becomes a little more complicated.

pkufool commented 3 years ago

I see, so we will implement a function in k2 that takes a lexicon FSA and a ragged tensor of linear symbol sequences as input and returns the CTC graph.

danpovey commented 3 years ago

No, it would take just a sequence of phone symbols (or whatever symbols the user is using, not necessarily words), and return the linear CTC graph; I think that is his scenario. This would not support optional silences.
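
To make the two topology options concrete, here is a minimal pure-Python sketch of a linear CTC graph built from one symbol sequence. It is not the k2 implementation: the names `linear_ctc_graph`, `accepts`, `batch_ctc_graphs`, the `blank_between_repeats` flag, and the choice of 0 as the blank ID are all assumptions made for illustration, and the graph is a plain arc list rather than a k2 FSA.

```python
from typing import List, Set, Tuple

BLANK = 0  # assumed blank symbol ID (hypothetical choice for this sketch)

Arc = Tuple[int, int, int]  # (src_state, dst_state, label)


def linear_ctc_graph(
    symbols: List[int], blank_between_repeats: bool = True
) -> Tuple[List[Arc], Set[int]]:
    """Build a linear CTC acceptor for one symbol sequence.

    Even states 2*i are "blank" states, odd states 2*i+1 are "symbol"
    states for symbols[i]. State 0 is the start state.
    """
    n = len(symbols)
    arcs: List[Arc] = []
    for i in range(n + 1):
        blank_state = 2 * i
        arcs.append((blank_state, blank_state, BLANK))  # stay on blank
        if i == n:
            break
        sym_state = 2 * i + 1
        arcs.append((blank_state, sym_state, symbols[i]))  # enter symbol
        arcs.append((sym_state, sym_state, symbols[i]))  # repeat same frame label
        arcs.append((sym_state, sym_state + 1, BLANK))  # leave via blank
        if i + 1 < n:
            is_repeat = symbols[i + 1] == symbols[i]
            # Skip the blank state unless a blank is required between repeats.
            if not (is_repeat and blank_between_repeats):
                arcs.append((sym_state, sym_state + 2, symbols[i + 1]))
    finals = {2 * n}
    if n > 0:
        finals.add(2 * n - 1)  # may end on the last symbol without a blank
    return arcs, finals


def accepts(arcs: List[Arc], finals: Set[int], frames: List[int]) -> bool:
    """Simulate the (possibly nondeterministic) acceptor on frame labels."""
    states = {0}
    for label in frames:
        states = {dst for (src, dst, lab) in arcs if src in states and lab == label}
    return bool(states & finals)


def batch_ctc_graphs(sequences: List[List[int]], blank_between_repeats: bool = True):
    """Build one graph per sequence; a list of lists stands in for a ragged tensor."""
    return [linear_ctc_graph(s, blank_between_repeats) for s in sequences]
```

With `blank_between_repeats=True`, the only extra work is the `is_repeat` check that suppresses the skip arc, which is the complication mentioned above. For the sequence `[1, 1]`, the frame labels `[1, 0, 1]` are accepted under both settings, while `[1, 1]` is accepted only when the blank is not required.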

pkufool commented 3 years ago

Implemented on the C++ side; see https://github.com/k2-fsa/k2/pull/776