PedroEstevesPT / kaldi_toy_example

Toy example to illustrate how to use kaldi recipes.

kaldi_toy_example

Toy example inspired by the Kaldi for Dummies tutorial.

This tutorial is a hands-on, practical introduction to Kaldi (a modern toolkit used for ASR and other speech processing tasks). The only prerequisite is having Kaldi installed.

It deviates only slightly from the Kaldi for Dummies tutorial (https://kaldi-asr.org/doc/kaldi_for_dummies.html): the data is already prepared, and there is one extra step, getting the best transcriptions generated by the ASR system.

After cloning the repository, training the model and decoding takes two steps:

1- cd into pedro_scripts and run "python format_wavscp.py"

2- Back in the top-level directory of the repository, train, decode and get the results with: ./run.sh

You might need to change DATA_ROOT in path.sh if you did not clone this repo in the directory kaldi/egs.

To get the transcriptions generated by the ASR system type the following:

../../src/latbin/lattice-best-path ark:'gunzip -c exp/tri1/decode/lat.1.gz |' 'ark,t:| utils/int2sym.pl -f 2- exp/tri1/graph/words.txt > out.txt'
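
Because utils/int2sym.pl is called with -f 2-, only the fields from the second one onwards are mapped to words, so each line of out.txt should start with an utterance ID followed by the recognized words. The helper below is a minimal sketch, not part of the repository, for looking up the hypothesis of a single utterance in such a file. The function name hyp_for is hypothetical, and it assumes the utterance ID and the words are separated by single spaces.

```shell
# Hypothetical helper, not part of the repository.
# Usage: hyp_for <utterance-id> <file>
# Prints the words that follow the given utterance ID in a file whose
# lines look like "<utterance-id> word1 word2 ...".
hyp_for() {
    awk -v id="$1" '
        $1 == id {
            $1 = ""          # drop the utterance ID field
            sub(/^ /, "")    # remove the separator left in front
            print
        }
    ' "$2"
}
```

For example, "hyp_for some_utt_id out.txt" prints only the transcription of that utterance, where some_utt_id stands for one of the IDs that actually appear in out.txt.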

Directories:

Command to convert a directory of m4a files to wav:

for f in *.m4a; do ffmpeg -i "$f" "${f/%m4a/wav}"; done

Command to downsample to 16 kHz (an extra directory is needed for the output because the same filename cannot be used as both input and output file; doing so causes an error):

mkdir -p tmp; for file in *.wav; do sox "$file" -r 16000 "./tmp/$file"; done
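
After downsampling, it can be worth confirming that every file in tmp really is at 16000 Hz. The sketch below reads the sample rate straight from the WAV header, so it does not need sox. The function names wav_rate and check_rates are hypothetical. It assumes a canonical header in which the "fmt " chunk directly follows the 12-byte RIFF header (placing the sample rate at byte offset 24) and a little-endian machine, since od reads integers in host byte order.

```shell
# Hypothetical helpers, not part of the repository.

# Print the sample rate stored in a canonical WAV header.
# Assumption: the 32-bit little-endian sample rate is at byte offset 24.
wav_rate() {
    od -An -t u4 -j 24 -N 4 "$1" | tr -d '[:space:]'
}

# List every .wav file in a directory whose sample rate is not 16000 Hz.
# Returns 0 if all files are at 16000 Hz, 1 otherwise.
check_rates() {
    dir=$1
    bad=0
    for f in "$dir"/*.wav; do
        [ -e "$f" ] || continue      # skip the literal pattern if no match
        rate=$(wav_rate "$f")
        if [ "$rate" != "16000" ]; then
            echo "$f: $rate Hz"
            bad=1
        fi
    done
    return $bad
}
```

Running "check_rates tmp" prints nothing when all files have the expected rate. For files with a less common header layout, a dedicated audio tool is the safer way to inspect the rate.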