alumae / kaldi-offline-transcriber

Offline transcription system for Estonian using Kaldi

generating alignments #10

Open yasheshgaur opened 8 years ago

yasheshgaur commented 8 years ago

Hi,

Kaldi scripts usually generate alignments along with lattices, so you get both lat.*.gz and ali.*.gz files.

In the offline transcriber, however, we only get the lattices as output. Is there any way to also generate alignments?

Thanks!

alumae commented 8 years ago

Alignments in the form of CTM files can already be generated (see https://github.com/alumae/kaldi-offline-transcriber/blob/master/Makefile#L249). I.e., you may invoke

make build/output/foo.ctm

which generates a CTM file for src-audio/foo.mp3.
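For readers unfamiliar with CTM: each line holds one word as whitespace-separated fields (recording ID, channel, start time in seconds, duration in seconds, the word, and optionally a confidence score). Below is a minimal Python sketch for reading such a file. The sample lines and the names `CtmWord`, `parse_ctm_line` and `parse_ctm` are made up for illustration; the exact IDs that the transcriber writes in the first column may differ.

```python
from typing import List, NamedTuple, Optional


class CtmWord(NamedTuple):
    recording: str
    channel: str
    start: float                 # seconds from the start of the recording
    duration: float              # seconds
    word: str
    confidence: Optional[float]  # None when the CTM has no confidence column


def parse_ctm_line(line: str) -> CtmWord:
    """Parse one CTM line: <recording> <channel> <start> <duration> <word> [<conf>]."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"not a CTM line: {line!r}")
    conf = float(fields[5]) if len(fields) > 5 else None
    return CtmWord(fields[0], fields[1], float(fields[2]),
                   float(fields[3]), fields[4], conf)


def parse_ctm(text: str) -> List[CtmWord]:
    """Parse CTM content, skipping blank lines and ';;' comment lines."""
    return [parse_ctm_line(line) for line in text.splitlines()
            if line.strip() and not line.startswith(";;")]


# Hypothetical sample content, not actual transcriber output.
sample = """\
foo 1 0.50 0.32 tere 0.98
foo 1 0.82 0.41 hommikust
"""
words = parse_ctm(sample)
for w in words:
    print(f"{w.start:7.2f} {w.start + w.duration:7.2f} {w.word}")
```

The word end time is not stored in the file; it is the start time plus the duration.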

If you need alignments in another format (e.g. phone alignments), you can look inside the steps/get_ctm.sh file and modify it according to your needs.

vince62s commented 8 years ago

By the way, is there any existing script to convert CTM files into the Kaldi training data files (text, segments, utt2spk, spk2utt)?
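No such script is named in this thread. As a rough sketch of what the conversion could look like in Python: the code below groups consecutive CTM words into utterances, splitting wherever the pause between two words exceeds a threshold, and builds the lines of the four files. Everything here is an assumption made for illustration: the function name `ctm_to_kaldi_data`, the 0.5 s `max_gap` default, the utterance ID scheme, and the use of the recording ID as the speaker ID (a CTM file carries no speaker information).

```python
from collections import defaultdict


def ctm_to_kaldi_data(ctm_text, max_gap=0.5):
    """Build lines for text, segments, utt2spk and spk2utt from CTM content.

    Assumptions (not from the transcriber itself): a new utterance starts
    whenever the pause between words exceeds max_gap seconds, and each
    recording is treated as a single speaker.
    """
    words = []
    for line in ctm_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or line.startswith(";;"):
            continue  # skip blank, malformed and comment lines
        words.append((fields[0], float(fields[2]), float(fields[3]), fields[4]))
    words.sort(key=lambda w: (w[0], w[1]))  # by recording, then start time

    utts = []  # each entry: [recording, start, end, [word, ...]]
    for rec, start, dur, word in words:
        end = start + dur
        if utts and utts[-1][0] == rec and start - utts[-1][2] <= max_gap:
            utts[-1][2] = end          # extend the current utterance
            utts[-1][3].append(word)
        else:
            utts.append([rec, start, end, [word]])

    text, segments, utt2spk = [], [], []
    spk2utt = defaultdict(list)
    for rec, start, end, tokens in utts:
        # Speaker-prefixed, zero-padded IDs keep the files consistently sorted.
        utt_id = f"{rec}-{round(start * 100):07d}-{round(end * 100):07d}"
        text.append(f"{utt_id} {' '.join(tokens)}")
        segments.append(f"{utt_id} {rec} {start:.2f} {end:.2f}")
        utt2spk.append(f"{utt_id} {rec}")
        spk2utt[rec].append(utt_id)

    return {
        "text": text,
        "segments": segments,
        "utt2spk": utt2spk,
        "spk2utt": [f"{spk} {' '.join(ids)}" for spk, ids in sorted(spk2utt.items())],
    }


# Hypothetical sample content, not actual transcriber output.
sample = """\
foo 1 0.50 0.30 tere
foo 1 0.80 0.40 hommikust
foo 1 3.00 0.50 nagemist
"""
data = ctm_to_kaldi_data(sample)
for name, lines in data.items():
    print(name, lines)
```

A segments file also needs a matching wav.scp entry for each recording ID, which this sketch does not produce. The resulting directory can be checked with Kaldi's utils/validate_data_dir.sh.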