cmusphinx / sphinx4

Pure Java speech recognition library
cmusphinx.sourceforge.net

the error about Acoustic Model #26

Closed yyq745201 closed 9 years ago

yyq745201 commented 9 years ago

The demo was running well until I changed 'English model' to 'Mandarin Language Model'. Then I got this error:

java.lang.IndexOutOfBoundsException: Index: 71680, Size: 71680
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool.get(Pool.java:55)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.createSenonePool(Sphinx3Loader.java:501)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.loadModelFiles(Sphinx3Loader.java:386)
    at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.load(Sphinx3Loader.java:315)
    ...

This is my code:

Configuration configuration = new Configuration();
// Backslashes in Windows paths must be escaped inside Java string literals.
configuration.setAcousticModelPath("file:" + configPath + "\\zh\\zh");
configuration.setDictionaryPath("file:" + configPath + "\\zh\\zh_broadcastnews_utf8.dic");
configuration.setLanguageModelPath("file:" + configPath + "\\zh\\zh_broadcastnews_64000_utf8.dmp");
try {
    StreamSpeechRecognizer streamRecognizer = new StreamSpeechRecognizer(configuration);
} catch (Exception ex) {
    ex.printStackTrace();
}

nshmyrev commented 9 years ago

Mandarin model is not supported by sphinx4 unfortunately.

ZeR0ll commented 6 years ago

@nshmyrev Same problem, and it's 2018 now, lol. Does it support Mandarin now? If not, can I train my own acoustic model?

nibircse commented 6 years ago

You can train any acoustic model you like. You just need an LM, a phonetic dictionary, and audio.

naliazheli commented 6 years ago

The Mandarin model is not supported by sphinx4, really?

canonbob51 commented 6 years ago

It's a pretty old program that's also pretty inactive, so that's not very surprising.

lmxyy commented 5 years ago

I'm really upset.

timobaumann commented 5 years ago

Being upset won't help.

What needs to be done is implement a frontend that can cope with tone and then train acoustic and language models using some data.

This is open-source software, and you can fix issues and add missing functionality. Or you can whine about missing functionality, but that won't change anything. Your choice.

nshmyrev commented 5 years ago

The newly released Mandarin model is supported by sphinx4:

https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/

lmxyy commented 5 years ago

I've downloaded the model, but I still hit an ArrayIndexOutOfBoundsException when trying to align a short Chinese audio clip with its transcription.

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 115
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.growFastmatchBranches(WordPruningBreadthFirstLookaheadSearchManager.java:272)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.fastMatchRecognize(WordPruningBreadthFirstLookaheadSearchManager.java:212)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.localStart(WordPruningBreadthFirstLookaheadSearchManager.java:244)
    at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.startRecognition(WordPruningBreadthFirstSearchManager.java:274)
    at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:62)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:106)
    at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:122)
    at edu.cmu.sphinx.api.SpeechAligner.align(SpeechAligner.java:120)
    at edu.cmu.sphinx.api.SpeechAligner.align(SpeechAligner.java:65)
    at Aligner.run(Aligner.java:29)
    at Main.main(Main.java:5)

Here's my aligner:

import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

import edu.cmu.sphinx.alignment.LongTextAligner;
import edu.cmu.sphinx.api.SpeechAligner;
import edu.cmu.sphinx.result.WordResult;

public class Aligner {
    private static final String ACOUSTIC_MODEL_PATH = "cmusphinx-zh-cn-5.2/zh_cn.cd_cont_5000";
    private static final String DICTIONARY_PATH = "cmusphinx-zh-cn-5.2/zh_cn.dic";
    private String audioPath, transcriptPath, outPath;

    public Aligner(String audioPath, String transcriptionPath, String outPath) {
        this.audioPath = audioPath;
        this.transcriptPath = transcriptionPath;
        this.outPath = outPath;
    }

    public void run() throws Exception {
        SpeechAligner aligner = new SpeechAligner(ACOUSTIC_MODEL_PATH, DICTIONARY_PATH, null);
        URL audioUrl = new File(audioPath).toURI().toURL();
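        // Read the whole transcript file as one string; "\\Z" matches the end of input.
        String transcript;
        try (Scanner scanner = new Scanner(new File(transcriptPath))) {
            scanner.useDelimiter("\\Z");
            transcript = scanner.next();
        }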

        List<WordResult> results = aligner.align(audioUrl, transcript);
//        List<String> stringResults = new ArrayList<>();
//        for (WordResult wr : results) {
//            stringResults.add(wr.getWord().getSpelling());
//        }
//
//        LongTextAligner textAligner = new LongTextAligner(stringResults, 2);
//        List<String> sentences = aligner.getTokenizer().expand(transcript);
//        List<String> words = aligner.sentenceToWords(sentences);
//
//        int[] aid = textAligner.align(words);
//        int lastId = -1;
//        for (int i = 0; i < aid.length; ++i) {
//            if (aid[i] == -1) {
//                System.out.format("- %s\n", words.get(i));
//            } else {
//                if (aid[i] - lastId > 1) {
//                    for (WordResult result : results.subList(lastId + 1,
//                            aid[i])) {
//                        System.out.format("+ %-25s [%s]\n", result.getWord()
//                                .getSpelling(), result.getTimeFrame());
//                    }
//                }
//                System.out.format("  %-25s [%s]\n", results.get(aid[i])
//                        .getWord().getSpelling(), results.get(aid[i])
//                        .getTimeFrame());
//                lastId = aid[i];
//            }
//        }
//
//        if (lastId >= 0 && results.size() - lastId > 1) {
//            for (WordResult result : results.subList(lastId + 1,
//                    results.size())) {
//                System.out.format("+ %-25s [%s]\n", result.getWord()
//                        .getSpelling(), result.getTimeFrame());
//            }
//        }
    }
}

And the transcription is

家 住 女 贞 路 四 号 的 德 思 礼 夫 妇 总 是 得 意 地 说 他 们 是 非 常 规 矩 的 人 家 拜 托 拜 托 了

nshmyrev commented 5 years ago

I've downloaded the model, but I still hit an ArrayIndexOutOfBoundsException when trying to align a short Chinese audio clip with its transcription.

This means you are using a very old sphinx4 version or very old jars.

lmxyy commented 5 years ago

I downloaded sphinx4-core-5prealpha.jar from oss.sonatype.org. Should I try version 5prealpha-SNAPSHOT?

lmxyy commented 5 years ago

I've tried 5prealpha-SNAPSHOT; the result was the same.

nshmyrev commented 5 years ago

What is the MD5 sum of your jars?

lmxyy commented 5 years ago

MD5 (sphinx4-core-5prealpha.jar) = 53b3765c2ba93f60e193d049a6e110ee
MD5 (sphinx4-core-5prealpha-20160628.232526-10.jar) = 53b3765c2ba93f60e193d049a6e110ee
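
For readers without an `md5` command-line tool, a checksum like the ones above can also be computed from Java itself. The following is a minimal sketch using only the JDK's `MessageDigest`; the class name `JarChecksum` and the method `md5Hex` are made up for illustration and are not part of sphinx4.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of sphinx4.
class JarChecksum {

    /** Returns the lowercase hex MD5 digest of everything read from the stream. */
    static String md5Hex(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int read;
            // Feed the stream to the digest in chunks so large jars are not held in memory.
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this JVM", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more jar paths as arguments, e.g. sphinx4-core-5prealpha.jar
        for (String path : args) {
            try (InputStream in = Files.newInputStream(Paths.get(path))) {
                System.out.println("MD5 (" + path + ") = " + md5Hex(in));
            }
        }
    }
}
```

Two files that produce the same digest, as the two jars above do, are almost certainly byte-for-byte identical, so switching between them would not be expected to change behaviour.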

acely commented 5 years ago

I ran into similar problems last week. I rebuilt the jar file from the latest code on GitHub, and then everything worked fine. (You must also use the latest model.) Maybe you can try this jar: sphinx4.jar.zip

By the way, sphinx4 treats every line as a sentence, so your transcription is only one sentence.
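
To see how many sentences a transcript file would contain under the one-line-per-sentence behaviour described above, something like the following could be used. It is plain string handling and does not call sphinx4. The class name `TranscriptLines`, the method `sentences`, skipping blank lines, and reading the file as UTF-8 are all assumptions made for this sketch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of sphinx4.
class TranscriptLines {

    /** Splits a transcript on line breaks and drops blank lines (blank-line handling is an assumption). */
    static List<String> sentences(String transcript) {
        List<String> result = new ArrayList<>();
        // "\\R" matches any line break, including "\r\n" as a single unit (Java 8+).
        for (String line : transcript.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java TranscriptLines <transcript file>");
            return;
        }
        // Assumes the transcript is stored as UTF-8.
        String transcript = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        List<String> lines = sentences(transcript);
        System.out.println(lines.size() + " sentence(s), one per non-empty line");
    }
}
```

A transcript written on a single line, like the one posted above, yields one sentence; putting each sentence on its own line would yield one entry per line.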