jeffheaton / encog-java-core

http://www.heatonresearch.com/encog

How do I use the encog data formats for audio classification with hidden markov models? #204

Open LSayn opened 9 years ago

LSayn commented 9 years ago

I am trying to classify audio data using MFCC features and a Hidden Markov Model. I am using the Encog Java library.

I have a working MFCC implementation and get 13 coefficients out of it.

I can't get it to work properly. I think my biggest problem is understanding the data format Encog uses.

For training I have an array that holds all the MFCC vectors of the training data, plus an array holding the supervised-learning label for each vector. Right now, I am simply putting these into a BasicMLSequenceSet.
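An HMM is trained on sequences of observations rather than on isolated vectors, so one common arrangement is a sequence of MFCC frames per recording, grouped by class label. The following is a minimal plain-Java sketch of that grouping. It does not use the Encog API; `LabeledRecording` and `groupByLabel` are hypothetical names, and it assumes each recording carries exactly one label.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MfccSequences {

    /** Hypothetical container: one recording = one label plus its MFCC frames in time order. */
    static final class LabeledRecording {
        final String label;
        final double[][] frames; // frames[t] is the coefficient vector at time step t

        LabeledRecording(String label, double[][] frames) {
            this.label = label;
            this.frames = frames;
        }
    }

    /** Groups recordings by label so that each class gets its own list of sequences. */
    static Map<String, List<double[][]>> groupByLabel(List<LabeledRecording> recordings) {
        Map<String, List<double[][]>> byLabel = new LinkedHashMap<>();
        for (LabeledRecording r : recordings) {
            byLabel.computeIfAbsent(r.label, k -> new ArrayList<>()).add(r.frames);
        }
        return byLabel;
    }

    public static void main(String[] args) {
        // Placeholder data: zero-filled frames with 13 coefficients each.
        List<LabeledRecording> data = new ArrayList<>();
        data.add(new LabeledRecording("speech", new double[5][13]));
        data.add(new LabeledRecording("music", new double[8][13]));
        data.add(new LabeledRecording("speech", new double[3][13]));

        Map<String, List<double[][]>> grouped = groupByLabel(data);
        for (Map.Entry<String, List<double[][]>> e : grouped.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().size() + " sequence(s)");
        }
    }
}
```

Each per-class list of sequences could then be copied into whatever sequence container the chosen HMM library expects.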

This "works" in the sense that it doesn't throw an exception.

Classification works similarly: I put the MFCC vector in a BasicMLDataSet and call the probability method of the HMM. It throws an ArrayIndexOutOfBoundsException here.
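A common HMM classification scheme trains one model per class and assigns a new sequence to the class whose model gives the highest log-likelihood. The sketch below shows only that selection step in plain Java. `SequenceScorer` is a hypothetical stand-in for whatever computes the log-likelihood (it is not an Encog interface), and the scorers in `main` are dummy constants. The dimension check rests on an assumption, not a confirmed diagnosis: a mismatch between the frame length and the dimension the model expects is one possible source of an ArrayIndexOutOfBoundsException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HmmClassifierSketch {

    /** Hypothetical stand-in for a trained per-class model; not an Encog interface. */
    interface SequenceScorer {
        double logLikelihood(double[][] frames);
    }

    /** Fails early with a readable message if any frame has an unexpected length. */
    static void checkDimension(double[][] frames, int expectedDim) {
        for (int t = 0; t < frames.length; t++) {
            if (frames[t].length != expectedDim) {
                throw new IllegalArgumentException(
                        "Frame " + t + " has " + frames[t].length
                        + " values, expected " + expectedDim);
            }
        }
    }

    /** Returns the label whose model scores the sequence highest, or null if no models are given. */
    static String classify(Map<String, SequenceScorer> models, double[][] frames) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, SequenceScorer> e : models.entrySet()) {
            double score = e.getValue().logLikelihood(frames);
            if (best == null || score > bestScore) {
                best = e.getKey();
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] frames = new double[4][13]; // placeholder sequence: 4 frames, 13 coefficients
        checkDimension(frames, 13);

        // Dummy scorers for illustration only; real ones would wrap trained models.
        Map<String, SequenceScorer> models = new LinkedHashMap<>();
        models.put("speech", f -> -12.5);
        models.put("music", f -> -40.0);

        System.out.println(classify(models, frames)); // prints speech
    }
}
```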

What am I doing wrong? I know I should normalize the features before using them, but how? Should I use a different data format here?
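One widely used way to normalize MFCC features is per-coefficient mean and variance normalization (a z-score): estimate the mean and standard deviation of each coefficient on the training frames, then apply the same statistics to every frame at training and classification time. The class below is a minimal plain-Java sketch of that idea. `MfccNormalizer` is a hypothetical name, not part of Encog, and whether this scheme suits a particular data set is an assumption to verify.

```java
import java.util.Arrays;

public class MfccNormalizer {
    private final double[] mean;
    private final double[] std;

    /** Estimates per-coefficient mean and standard deviation from the training frames. */
    public MfccNormalizer(double[][] trainingFrames) {
        if (trainingFrames.length == 0) {
            throw new IllegalArgumentException("Need at least one training frame");
        }
        int dim = trainingFrames[0].length;
        mean = new double[dim];
        std = new double[dim];
        for (double[] frame : trainingFrames) {
            for (int d = 0; d < dim; d++) {
                mean[d] += frame[d];
            }
        }
        for (int d = 0; d < dim; d++) {
            mean[d] /= trainingFrames.length;
        }
        for (double[] frame : trainingFrames) {
            for (int d = 0; d < dim; d++) {
                double diff = frame[d] - mean[d];
                std[d] += diff * diff;
            }
        }
        for (int d = 0; d < dim; d++) {
            std[d] = Math.sqrt(std[d] / trainingFrames.length);
            if (std[d] == 0.0) {
                std[d] = 1.0; // constant coefficient: avoid division by zero
            }
        }
    }

    /** Returns a normalized copy of the frame using the training statistics. */
    public double[] normalize(double[] frame) {
        if (frame.length != mean.length) {
            throw new IllegalArgumentException(
                    "Frame has " + frame.length + " values, expected " + mean.length);
        }
        double[] out = new double[frame.length];
        for (int d = 0; d < frame.length; d++) {
            out[d] = (frame[d] - mean[d]) / std[d];
        }
        return out;
    }

    public static void main(String[] args) {
        // Two toy 2-dimensional frames; real MFCC frames would have 13 values.
        double[][] training = {{1.0, 10.0}, {3.0, 30.0}};
        MfccNormalizer normalizer = new MfccNormalizer(training);
        System.out.println(Arrays.toString(normalizer.normalize(new double[] {3.0, 30.0})));
        // prints [1.0, 1.0]
    }
}
```

Reusing the training statistics at classification time keeps both sets of frames on the same scale.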

Sorry for the vagueness, but I am really stuck here and can't seem to get it to work. I am a little bit desperate here, so if anyone could help me out a little bit I would be more than thankful.

LSayn commented 9 years ago

In case this is the wrong place to ask, I have posted the same question on Stack Overflow: http://stackoverflow.com/questions/28801389/how-do-i-use-the-encog-data-formats-for-audio-classification-with-hidden-markov

ghost commented 9 years ago

LSayn, have you had any luck with this yet? I'm starting some toy programs with audio and Encog.

LSayn commented 9 years ago

@andrelopes1705 Nope. I haven't. I used Jahmm (just google it) and added the few things I needed.

jeffheaton commented 6 years ago

Encog does not have any support for reading audio directly. However, I will add an example of directly building a dataset of this type from scratch.

true-meowmeow commented 11 months ago

> Encog does not have any support for reading audio directly. However, I will add an example of directly building a dataset of this type from scratch.

I am currently working with sound and have some difficulties. Is it possible to get a link to this example?