lkuza2 / java-speech-api

The J.A.R.V.I.S. Speech API is designed to be simple and efficient, using the speech engines created by Google to provide functionality for parts of the API. Essentially, it is an API written in Java, including a recognizer, synthesizer, and a microphone capture utility. The project uses Google services for the synthesizer and recognizer. While this requires an Internet connection, it provides a complete, modern, and fully functional speech API in Java.
GNU General Public License v3.0

calculateEnergy #98

Open nalbion opened 6 years ago

nalbion commented 6 years ago

This VAD algorithm suggests calculating the energy of each frame. ...Is that the same as RMS?

This code from Sciss/SpeechRecognitionHMM seems to use a different algorithm:

    public double[] calcEnergy(float[][] framedSignal) {
        double[] energyValue = new double[framedSignal.length];
        for (int i = 0; i < framedSignal.length; i++) {
            float sum = 0;
            for (int j = 0; j < samplePerFrame; j++) {
                // sum the square
                sum += Math.pow(framedSignal[i][j], 2);
            }
            // find log
            energyValue[i] = Math.log(sum);
        }
        return energyValue;
    }

MicrophoneAnalyser:

    public static int calculateRMSLevel(byte[] audioData){
        long lSum = 0;
        for(int i=0; i<audioData.length; i++)
            lSum = lSum + audioData[i];

        double dAvg = lSum / audioData.length;

        double sumMeanSquare = 0d;
        for(int j=0; j<audioData.length; j++)
            sumMeanSquare = sumMeanSquare + Math.pow(audioData[j] - dAvg, 2d);

        double averageMeanSquare = sumMeanSquare / audioData.length;
        return (int)(Math.pow(averageMeanSquare,0.5d) + 0.5);
    }
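
To make the comparison concrete, here is a minimal sketch of how the quantities relate, assuming the usual definitions: frame energy is the sum of squared samples, and RMS is the square root of the mean of the squared samples. The class `FrameEnergy` and its method names are hypothetical and belong to neither library. Under those definitions the two are related but not equal. `calcEnergy` returns the natural log of the energy, while `calculateRMSLevel` subtracts the mean before squaring, so it is closer to a rounded standard deviation. Its `lSum / audioData.length` is also an integer division, so the mean is truncated.

```java
// Illustrative sketch only: FrameEnergy and its methods are hypothetical names,
// not part of java-speech-api or SpeechRecognitionHMM. Frames are assumed non-empty.
final class FrameEnergy {

    /** Short-time energy: the sum of the squared samples in one frame. */
    static double energy(float[] frame) {
        double sum = 0.0;
        for (float s : frame) {
            sum += (double) s * s;
        }
        return sum;
    }

    /** Natural log of the energy, which is what the calcEnergy snippet stores per frame. */
    static double logEnergy(float[] frame) {
        return Math.log(energy(frame));
    }

    /** RMS: square root of the mean squared sample, i.e. sqrt(energy / N). */
    static double rms(float[] frame) {
        return Math.sqrt(energy(frame) / frame.length);
    }

    /** RMS after removing the frame mean (a standard deviation), similar to calculateRMSLevel before rounding. */
    static double stdDev(float[] frame) {
        double mean = 0.0;
        for (float s : frame) {
            mean += s;
        }
        mean /= frame.length; // floating-point division, so the mean is not truncated
        double sumSq = 0.0;
        for (float s : frame) {
            double d = s - mean;
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / frame.length);
    }

    public static void main(String[] args) {
        float[] frame = {1f, -1f, 1f, -1f};
        System.out.println("energy    = " + energy(frame));
        System.out.println("logEnergy = " + logEnergy(frame));
        System.out.println("rms       = " + rms(frame));
        System.out.println("stdDev    = " + stdDev(frame));
    }
}
```

For a frame of N samples, these definitions give logEnergy = ln(N) + 2 * ln(rms), so for a fixed frame length the log energy and the RMS rank frames in the same order. The mean-removed version agrees with the plain RMS only when the frame has no DC offset.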