Closed yyq745201 closed 9 years ago
Mandarin model is not supported by sphinx4 unfortunately.
@nshmyrev Same problem, and it's 2018 now, lol. Does it support Mandarin now? If not, can I train my own acoustic model?
You can train any acoustic model you like. You just need a language model (LM), a phonetic dictionary, and audio.
Mandarin model is not supported by sphinx4, really?
It's a pretty old program that's also pretty inactive, so that's not very surprising.
I'm really upset.
Being upset won't help.
What needs to be done is to implement a frontend that can cope with tone, and then train acoustic and language models using some data.
This is open-source software, and you can fix issues and add missing functionality. Or you can whine about missing functionality, but that won't change anything. Your choice.
The newly released Mandarin model is supported by sphinx4:
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Mandarin/
I've downloaded the model, but I still get an ArrayIndexOutOfBoundsException when I try to align the audio of a short Chinese sentence with its transcription.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 115
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.growFastmatchBranches(WordPruningBreadthFirstLookaheadSearchManager.java:272)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.fastMatchRecognize(WordPruningBreadthFirstLookaheadSearchManager.java:212)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstLookaheadSearchManager.localStart(WordPruningBreadthFirstLookaheadSearchManager.java:244)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.startRecognition(WordPruningBreadthFirstSearchManager.java:274)
at edu.cmu.sphinx.decoder.Decoder.decode(Decoder.java:62)
at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:106)
at edu.cmu.sphinx.recognizer.Recognizer.recognize(Recognizer.java:122)
at edu.cmu.sphinx.api.SpeechAligner.align(SpeechAligner.java:120)
at edu.cmu.sphinx.api.SpeechAligner.align(SpeechAligner.java:65)
at Aligner.run(Aligner.java:29)
at Main.main(Main.java:5)
Here's my aligner:
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

import edu.cmu.sphinx.alignment.LongTextAligner;
import edu.cmu.sphinx.api.SpeechAligner;
import edu.cmu.sphinx.result.WordResult;

public class Aligner {
    private static final String ACOUSTIC_MODEL_PATH = "cmusphinx-zh-cn-5.2/zh_cn.cd_cont_5000";
    private static final String DICTIONARY_PATH = "cmusphinx-zh-cn-5.2/zh_cn.dic";

    private String audioPath, transcriptPath, outPath;

    public Aligner(String audioPath, String transcriptionPath, String outPath) {
        this.audioPath = audioPath;
        this.transcriptPath = transcriptionPath;
        this.outPath = outPath;
    }

    public void run() throws Exception {
        SpeechAligner aligner = new SpeechAligner(ACOUSTIC_MODEL_PATH, DICTIONARY_PATH, null);
        URL audioUrl = new File(audioPath).toURI().toURL();
        Scanner scanner = new Scanner(new File(transcriptPath));
        scanner.useDelimiter("\\Z");
        String transcript = scanner.next();
        List<WordResult> results = aligner.align(audioUrl, transcript);
        // List<String> stringResults = new ArrayList<>();
        // for (WordResult wr : results) {
        //     stringResults.add(wr.getWord().getSpelling());
        // }
        //
        // LongTextAligner textAligner = new LongTextAligner(stringResults, 2);
        // List<String> sentences = aligner.getTokenizer().expand(transcript);
        // List<String> words = aligner.sentenceToWords(sentences);
        //
        // int[] aid = textAligner.align(words);
        // int lastId = -1;
        // for (int i = 0; i < aid.length; ++i) {
        //     if (aid[i] == -1) {
        //         System.out.format("- %s\n", words.get(i));
        //     } else {
        //         if (aid[i] - lastId > 1) {
        //             for (WordResult result : results.subList(lastId + 1,
        //                     aid[i])) {
        //                 System.out.format("+ %-25s [%s]\n", result.getWord()
        //                         .getSpelling(), result.getTimeFrame());
        //             }
        //         }
        //         System.out.format(" %-25s [%s]\n", results.get(aid[i])
        //                 .getWord().getSpelling(), results.get(aid[i])
        //                 .getTimeFrame());
        //         lastId = aid[i];
        //     }
        // }
        //
        // if (lastId >= 0 && results.size() - lastId > 1) {
        //     for (WordResult result : results.subList(lastId + 1,
        //             results.size())) {
        //         System.out.format("+ %-25s [%s]\n", result.getWord()
        //                 .getSpelling(), result.getTimeFrame());
        //     }
        // }
    }
}
And the transcription is
家 住 女 贞 路 四 号 的 德 思 礼 夫 妇 总 是 得 意 地 说 他 们 是 非 常 规 矩 的 人 家 拜 托 拜 托 了
This means you are using a very old sphinx4 version/jars.
I downloaded sphinx4-core-5prealpha.jar from oss.sonatype.org. Should I try version 5prealpha-SNAPSHOT?
I've tried 5prealpha-SNAPSHOT; the result was the same.
What is the MD5 sum of your jars?
MD5 (sphinx4-core-5prealpha.jar) = 53b3765c2ba93f60e193d049a6e110ee
MD5 (sphinx4-core-5prealpha-20160628.232526-10.jar) = 53b3765c2ba93f60e193d049a6e110ee
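For anyone without an `md5` command at hand, the same check can be done from Java with the standard library alone. This is only a minimal sketch: the class name `JarMd5` and its method names are made up for illustration and are not part of sphinx4.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of sphinx4): prints the MD5 sum of each
// file given on the command line, in the same "MD5 (file) = hash" layout.
public class JarMd5 {

    /** Returns the lowercase hex MD5 digest of an in-memory byte array. */
    public static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        return toHex(md.digest(data));
    }

    /** Returns the lowercase hex MD5 digest of a file, read in chunks. */
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        return toHex(md.digest());
    }

    private static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        for (String arg : args) {
            System.out.println("MD5 (" + arg + ") = " + md5Hex(Paths.get(arg)));
        }
    }
}
```

Run it as `java JarMd5 sphinx4-core-5prealpha.jar`. Two jars that print the same hash are byte-for-byte the same build, so swapping one for the other cannot change the behavior.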
I ran into similar problems last week. I rebuilt the jar file from the latest code on GitHub, and then everything worked fine. (You must also use the latest model.) Maybe you can try this jar: sphinx4.jar.zip
By the way, sphinx4 treats every line as a sentence, so your transcription is only one sentence.
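If you want several sentences, one way to prepare the transcript is to put each sentence on its own line before handing it to the aligner. The sketch below is only an illustration under assumptions: the class `TranscriptLines` and the choice of sentence-ending punctuation are mine, not anything sphinx4 provides, and it only helps if the transcript still contains punctuation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of sphinx4): splits a transcript into one
// sentence per line, based on sentence-ending punctuation.
public class TranscriptLines {

    // Sentence terminators: ideographic full stop (U+3002), fullwidth
    // exclamation mark (U+FF01), fullwidth question mark (U+FF1F),
    // ASCII '!' and '?', and existing line breaks.
    private static final String TERMINATORS = "[\u3002\uFF01\uFF1F!?\\n]+";

    /** Returns the non-empty sentences of the text, trimmed, in order. */
    public static List<String> toLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String part : text.split(TERMINATORS)) {
            String sentence = part.trim();
            if (!sentence.isEmpty()) {
                lines.add(sentence);
            }
        }
        return lines;
    }

    /** Joins the sentences with newlines, one sentence per line. */
    public static String onePerLine(String text) {
        return String.join("\n", toLines(text));
    }
}
```

The terminators themselves are dropped from the output, which matches the punctuation-free transcript posted above.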
The demo was running well until I changed the English model to the Mandarin language model; then I got this error:
java.lang.IndexOutOfBoundsException: Index: 71680, Size: 71680
at java.util.ArrayList.rangeCheck(ArrayList.java:653)
at java.util.ArrayList.get(ArrayList.java:429)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Pool.get(Pool.java:55)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.createSenonePool(Sphinx3Loader.java:501)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.loadModelFiles(Sphinx3Loader.java:386)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.load(Sphinx3Loader.java:315)
...
This is my code:
Configuration configuration = new Configuration();
// Backslashes must be escaped inside Java string literals.
configuration.setAcousticModelPath("file:" + configPath + "\\zh\\zh");
configuration.setDictionaryPath("file:" + configPath + "\\zh\\zh_broadcastnews_utf8.dic");
configuration.setLanguageModelPath("file:" + configPath + "\\zh\\zh_broadcastnews_64000_utf8.dmp");
try {
    StreamSpeechRecognizer streamRecognizer = new StreamSpeechRecognizer(configuration);
} catch (Exception ex) {
    ex.printStackTrace();
}
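As a side note on the paths: hand-built `"file:" + dir + "\..."` strings are easy to get wrong on Windows. A hedged sketch of an alternative is to let the standard library build the `file:` URL. The class `ModelPaths` and the method `fileUrl` are hypothetical names for illustration, and this only addresses path construction; it is not a claim about what causes the IndexOutOfBoundsException above.

```java
import java.nio.file.Paths;

// Hypothetical helper (not part of sphinx4): builds a "file:" URL string
// from a base directory and path segments using the platform's separators.
public class ModelPaths {

    /**
     * Returns an absolute file: URL for baseDir/parts..., without a
     * trailing slash (Path.toUri() adds one for existing directories).
     */
    public static String fileUrl(String baseDir, String... parts) {
        String url = Paths.get(baseDir, parts)
                .toAbsolutePath()
                .normalize()
                .toUri()
                .toString();
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }

    public static void main(String[] args) {
        String configPath = args.length > 0 ? args[0] : "models";
        // These strings could then be passed to the setters used above, e.g.
        // configuration.setAcousticModelPath(fileUrl(configPath, "zh", "zh"));
        System.out.println(fileUrl(configPath, "zh", "zh"));
        System.out.println(fileUrl(configPath, "zh", "zh_broadcastnews_utf8.dic"));
        System.out.println(fileUrl(configPath, "zh", "zh_broadcastnews_64000_utf8.dmp"));
    }
}
```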