alumae / kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.
BSD 2-Clause "Simplified" License

Custom language model not working #186

Closed: alx741 closed 5 years ago

alx741 commented 5 years ago

The problem

I've trained a custom GMM/HMM model on my own Spanish data and get rather decent results, down to ~12% WER. When I try to use it with kaldi-gstreamer-server, though, I get terrible results (almost no output at all).

As test input I'm using a speech sample (sample.wav) that the model decodes perfectly (in the decoding phase of training) as "todos tienen el derecho a la educacion" with:

steps/decode.sh  --cmd "$decode_cmd" exp/tri2b/graph data/test exp/tri2b_mmi_b0.05/decode_test

However, when decoding on the worker with the python client:

python2 kaldigstserver/client.py -u ws://localhost:8080/client/ws/speech -r 88200 ./sample.wav

When I use the model in kaldi-gstreamer-server and send the exact same wave sample, the output I get is: "<UNK> un < u". So something is clearly going very wrong.

Note that I'm using -r 88200 because sample.wav is 44.1 kHz, 16-bit (is this reasoning correct?), though I've tried changing this value and the result is the same.
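My reasoning is 44100 samples/s × 2 bytes/sample × 1 channel = 88200 bytes/s, assuming -r takes a byte rate rather than a sample rate. A quick sketch I use to read those values from the WAV header with Python's standard wave module:

import wave

# Inspect the WAV header and compute the byte rate for client.py's -r flag
# (assuming -r means bytes of raw audio per second).
w = wave.open("sample.wav", "rb")
print("sample rate:  %d Hz" % w.getframerate())
print("sample width: %d bytes" % w.getsampwidth())
print("channels:     %d" % w.getnchannels())
print("-r value:     %d" % (w.getframerate() * w.getsampwidth() * w.getnchannels()))
w.close()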

What I'm doing

Note that before trying my custom model, I first tested kaldi-gstreamer-server with the provided English test model and data, and it worked (and continues to work) perfectly. It only fails when using my model.

This is how I'm setting this up:

I first run the training and then copy the resulting files to a model directory to be used by kaldi-gstreamer-server:

All the files are available here: https://drive.google.com/open?id=1aVxiWBl-hGN3JJCuJWa47aYJuloJD06u

cp exp/tri2b_mmi_b0.05/final.mdl   /media/kaldi_models/spanish
cp exp/tri2b/final.mat             /media/kaldi_models/spanish
cp exp/tri2b/graph/words.txt       /media/kaldi_models/spanish
cp exp/tri2b/graph/HCLG.fst        /media/kaldi_models/spanish

Then I write a config file: (NOTE: I'm actually using docker-kaldi-gstreamer-server here)

timeout-decoder : 10
decoder:
   model:     /opt/models/spanish/final.mdl
   lda-mat:   /opt/models/spanish/final.mat
   word-syms: /opt/models/spanish/words.txt
   fst:       /opt/models/spanish/HCLG.fst
   silence-phones: "1:2:3:4:5"
   beam: 13.0
out-dir: tmp

use-vad: False
silence-timeout: 60

# Just a sample post-processor that appends "." to the hypothesis
# post-processor: perl -npe 'BEGIN {use IO::Handle; STDOUT->autoflush(1);} s/(.*)/\1./;'
logging:
    version : 1
    disable_existing_loggers: False
    formatters:
        simpleFormater:
            format: '%(asctime)s - %(levelname)7s: %(name)10s: %(message)s'
            datefmt: '%Y-%m-%d %H:%M:%S'
    handlers:
        console:
            class: logging.StreamHandler
            formatter: simpleFormater
            level: DEBUG
    root:
        level: DEBUG
        handlers: [console]
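To rule out a broken config, I sanity-check that the YAML parses and that every path in the decoder section actually exists. A rough sketch using PyYAML; run it inside the container, where the models directory is mounted at /opt/models:

import os
import yaml

# Parse the worker config and verify each decoder file is present.
with open("/opt/models/spanish_worker.yaml") as f:
    conf = yaml.safe_load(f)

for key in ("model", "lda-mat", "word-syms", "fst"):
    path = conf["decoder"][key]
    status = "OK" if os.path.isfile(path) else "MISSING"
    print("%-10s %-35s %s" % (key, path, status))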

Run the container:

docker run -it -p 8080:80 -v /media/kaldi_models:/opt/models jcsilva/docker-kaldi-gstreamer-server:latest /bin/bash

Start a worker within the container:

 /opt/start.sh -y /opt/models/spanish_worker.yaml

Use the python client to get a result:

python2 kaldigstserver/client.py -u ws://localhost:8080/client/ws/speech -r 88200 ./sample.wav
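For reference, what the client does under the hood is stream raw audio chunks over the websocket, send the string "EOS" at the end, and read JSON hypotheses back. A stripped-down sketch of that exchange using the websocket-client package (the chunk size and pacing are arbitrary choices of mine):

import json
import time
import websocket  # pip install websocket-client

# Stream the file in small chunks, roughly paced like live audio,
# then signal end-of-stream and print the final hypothesis.
ws = websocket.create_connection("ws://localhost:8080/client/ws/speech")
with open("sample.wav", "rb") as f:
    while True:
        chunk = f.read(8000)
        if not chunk:
            break
        ws.send_binary(chunk)
        time.sleep(0.25)
ws.send("EOS")
while True:
    msg = json.loads(ws.recv())
    if msg.get("result", {}).get("final"):
        print(msg["result"]["hypotheses"][0]["transcript"])
        break
ws.close()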

Is there some particular requirement on how a model must be trained for it to work with kaldi-gstreamer-server? If not, what could be the problem here?

alx741 commented 5 years ago

SOLVED

I was training the model on audio data with a 44.1 kHz sampling rate, which apparently makes the decoder fail.

I fixed it by converting all the audio data to a 16 kHz sampling rate and re-training the model. I then used this new model with the exact same setup described in the first comment above, and it worked perfectly.

Note that the audio data I send to the decoder is also converted to 16 kHz first.
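In case it's useful to anyone, this is the kind of conversion I do. A minimal sketch with Python's standard wave and audioop modules; sox or ffmpeg works just as well, and note that audioop was removed from the stdlib in Python 3.13:

import audioop
import wave

# Downsample a 16-bit PCM WAV file from its native rate to 16 kHz.
src = wave.open("sample.wav", "rb")
frames = src.readframes(src.getnframes())
converted, _ = audioop.ratecv(
    frames, src.getsampwidth(), src.getnchannels(),
    src.getframerate(), 16000, None)

dst = wave.open("sample_16k.wav", "wb")
dst.setnchannels(src.getnchannels())
dst.setsampwidth(src.getsampwidth())
dst.setframerate(16000)
dst.writeframes(converted)
dst.close()
src.close()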

A follow-up issue about the decoder failing in this scenario is #187.