Julius High Word Error Rate (WER)

Abdolrahman commented 8 years ago

Hello,

I installed Julius 4.3.1 on Windows/Cygwin and am using my own Arabic acoustic model (HTK 3.4, 8.5-hour corpus), an n-gram LM, and a tied.list. The decoder runs, but the results are far from those of HTK HDecode: HDecode gives a WER of 30%, while Julius gives 90%, so I think something is wrong in my Julius decoder configuration.

The configurations are:

-input mfcfile
-filelist filelist/test.scp
-nlr test.lm
-v test.dic
-h hmmsdef.mmf # acoustic HMM (ascii or Julius binary)
-hlist tied.list # HMMList to map logical phone to physical
-b 10000
-s 20000 # hypotheses stack size on 2nd pass (#hypo)
-m 10000 # hypotheses overflow threshold (#hypo)
-n 100 # num of sentences to find

I also applied the LM scale factor (10) that I'm using in HTK, as follows: -lmp 10.0 0.0. The log file is attached: log.txt

I would be grateful for help tuning the decoder, as I think it should give a WER close to HTK's.

Appreciate your response.

Regards

Abdo

palles77 commented 8 years ago

To get Julius decoding properly, I suggest you download some examples from other repositories, such as https://github.com/julius-speech/dictation-kit, and study their contents carefully. Tuning Julius and getting it to work well is quite a lot of work. I have been tuning my results for the last 8 years, and only recently reached a WER of around 11%. Still, the example above is a good starting point. Another good resource is http://voxforge.net

Julius is quite different from HTK: the only similarity is that Julius can read HTK models. Apart from that, they do not have much in common. To give you a good starting point, here is the config I use:

-input file
-filelist models\test.dbl
-htkconf models\wav_config
-h models\PLPL-Legal-v0.1.am
-hlist models\PLPL-Legal-v0.1.phn
-d models\PLPL-Legal-v0.1.lm
-v models\PLPL-Legal-v0.1.dct
-b 2000
-walign
-fallback1pass
-spmodel sp
-multipath
-spsegment
-norealtime
-spmodel sp
-gprune none
-sepnum 150
-lmp 12 -4
-lmp2 12 -4
-b2 360
-n 40
-s 2000
-m 8000
-lookuprange 5
-sb 80
-forcedict

palles77 commented 8 years ago

You also seem to have quite a few errors in your log:

ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004270.mfcc
read analyzed parameter
Error: gzfile: failed to open "D:/Luminous/Test/corpus/output/test/004271.mfcc"
ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004271.mfcc
read analyzed parameter
Error: gzfile: failed to open "D:/Luminous/Test/corpus/output/test/004272.mfcc"
ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004272.mfcc
read analyzed parameter
Error: gzfile: failed to open "D:/Luminous/Test/corpus/output/test/004273.mfcc"
ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004273.mfcc
read analyzed parameter
Error: gzfile: failed to open "D:/Luminous/Test/corpus/output/test/004274.mfcc"
ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004274.mfcc
read analyzed parameter
Error: gzfile: failed to open "D:/Luminous/Test/corpus/output/test/004275.mfcc"
ERROR: error in reading parameter file: D:/Luminous/Test/corpus/output/test/004275.mfcc

Can it be that some of the files were completely ignored, thus your high WER?
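To make the concern concrete, here is a minimal Python sketch of corpus-level WER computed with a word-level edit distance. It is not taken from Julius or HTK; the function names are made up for illustration, and it assumes that an utterance the decoder skipped is scored against an empty hypothesis, which may not be how your scoring tool treats it.

```python
def word_errors(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion (reference word missing)
                cur[j - 1] + 1,          # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1]


def corpus_wer(pairs):
    """WER over (reference, hypothesis) string pairs: total errors / total reference words."""
    errors = words = 0
    for ref, hyp in pairs:
        ref_tokens = ref.split()
        errors += word_errors(ref_tokens, hyp.split())
        words += len(ref_tokens)
    return errors / words


# Hypothetical data: one utterance decoded perfectly, one skipped (empty hypothesis).
print(corpus_wer([("a b c d", "a b c d"), ("e f g h", "")]))
```

Under that assumption every word of a skipped utterance counts as a deletion, so a large number of unreadable files would inflate the WER by itself.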

Abdolrahman commented 8 years ago

Hello, the errors occur because I excluded some noisy files from the MFCC set. I can eliminate them by removing those MFCC files from the filelist. However, I don't think they affect the results.

What do you think?
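Removing the missing entries from the filelist can be scripted. The following is a rough Python sketch, assuming the filelist holds one path per line; the function names and the output file name are hypothetical, and only filelist/test.scp comes from the configuration above.

```python
from pathlib import Path


def split_filelist(lines):
    """Split filelist entries into (existing, missing) lists of paths."""
    existing, missing = [], []
    for line in lines:
        path = line.strip()
        if not path:
            continue  # skip blank lines
        (existing if Path(path).is_file() else missing).append(path)
    return existing, missing


def write_clean_filelist(src, dst):
    """Copy src to dst, keeping only entries that exist on disk; return the dropped ones."""
    lines = Path(src).read_text().splitlines()
    existing, missing = split_filelist(lines)
    Path(dst).write_text("".join(p + "\n" for p in existing))
    return missing
```

For example, write_clean_filelist("filelist/test.scp", "filelist/test.clean.scp") would write the reduced list and return the paths that were dropped, so the same set can be excluded when scoring.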

Abdolrahman commented 8 years ago

Hello palles77, thank you so much for the parameters you sent. They helped a lot with debugging and word alignment.

I have one more question. I split the score into its AM and LM parts and got the following: AM: -38075.484375 LM: -988.138062

I believe these two scores are very far apart, given that I used a scale factor of 10.0 in HTK. However, I could not find the cause, as -lmp does not seem to account for such a large gap. Any ideas?

palles77 commented 8 years ago

I have never looked at these AM and LM figures and could never make out what they actually mean. I suppose you could debug Julius on Windows using VS2013, as I normally do, and see how these values are calculated. As for the files sent to Julius, I would recommend sending WAV files instead of MFC files if possible: it makes things easier and less error-prone. If you really want more help from me, you can send me your models and wave files and I can try experimenting with them. However, I do not know your native language, so I might not be of much help anyway.

LeeAkinobu commented 8 years ago

Hello @Abdolrahman ,

The scores of Julius are in log10, whereas the HTK scores are in natural log. The LM score reflects both the LM weight and the word penalties.
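As a small illustration of the base difference, the Python sketch below converts between the two log bases. It assumes, as stated above, that the reported Julius values are log10 scores; the helper names are made up for this example.

```python
import math

LN10 = math.log(10.0)  # natural log of 10, roughly 2.3026


def log10_to_ln(score):
    """Convert a log10 score to a natural-log score."""
    return score * LN10


def ln_to_log10(score):
    """Convert a natural-log score to a log10 score."""
    return score / LN10


# Scores quoted earlier in this thread, assumed here to be in log10.
am_log10 = -38075.484375
lm_log10 = -988.138062

print("AM in natural log:", log10_to_ln(am_log10))
print("LM in natural log:", log10_to_ln(lm_log10))
```

The conversion is a constant factor, so it changes the magnitude of each score but not the ratio between the AM and LM parts. Since the LM figure already includes the LM weight and word penalties, it is not directly comparable to an unscaled LM log-probability.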
