Open Legoless opened 11 years ago
Check the sampling frequency. I got the same error when I encoded the audio at 8000 Hz, but it worked properly when I encoded the same audio at 16000 Hz.
Hello, I would like to know which variable or value in the MFCC processor represents the amount of energy implicit in the voice at the moment recognition is performed.
I'm using your MFCC processor in a custom project and some songs return inf and NaN float values for some frames.
More information: I am using a custom audio file loader and run the MFCC processor in a single pass. All audio samples are processed in one buffer, not buffer by buffer as the samples are read. Only single-channel float data is sent to the MFCC processor, with values between -1.0 and 1.0 of course.
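For anyone trying to reproduce this, a quick sanity check on the input buffer can rule out bad samples before they reach the MFCC processor. This is only a minimal Python sketch, not part of the MFCC processor; the function name `check_samples` and its parameters are hypothetical.

```python
import math

def check_samples(samples, lo=-1.0, hi=1.0):
    """Return the indices of samples that are NaN, infinite, or outside [lo, hi].

    Hypothetical helper: `samples` is assumed to be a flat sequence of
    single-channel floats, as described in the post above.
    """
    return [
        i for i, s in enumerate(samples)
        if not math.isfinite(s) or s < lo or s > hi
    ]
```

An empty result means every sample is finite and within range, so any NaN in the output would have to originate inside the processing itself.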
For some of my test songs, the first few frames return NaN (the rest appear to be fine). Because any arithmetic involving NaN yields NaN, the mean operation then spreads NaN across the whole vector.
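As a stopgap, the mean can be computed over only the frames whose coefficients are all finite. The sketch below is in Python for brevity and is not the MFCC processor's own code; `mean_of_finite_frames` is a hypothetical name, and each frame is assumed to be a list of floats of equal length. It hides the symptom rather than fixing the cause. One possible cause (an assumption, not confirmed here) is taking the logarithm of zero energy in silent lead-in frames.

```python
import math

def mean_of_finite_frames(frames):
    """Average MFCC vectors, skipping frames that contain NaN or inf.

    Returns (mean_vector, skipped_count). mean_vector is None when no
    frame is fully finite.
    """
    # Keep only frames in which every coefficient is a finite number.
    good = [f for f in frames if all(math.isfinite(c) for c in f)]
    skipped = len(frames) - len(good)
    if not good:
        return None, skipped
    n = len(good)
    dim = len(good[0])
    # Per-coefficient average over the remaining frames.
    mean = [sum(f[k] for f in good) / n for k in range(dim)]
    return mean, skipped
```

Reporting the skipped count makes it easy to see how many frames were affected for a given song.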
Also, AVAssetReader apparently does some preprocessing itself (mixing, normalizing, or something similar). Do you have any information about that? The audio loading system I use is built on Extended Audio File Services, and its sample values differ from those obtained from AVAssetReader. As a result, the MFCCs are quite different depending on whether they are calculated from the AVAssetReader source or from Extended Audio File Services (I am actually using TheAmazingAudioEngine's audio file loading operation).