Subtitle-Synchronizer / jlibrosa

Librosa-equivalent Java library to process audio files and extract features from them.
MIT License

Librosa equivalence #9

Open skoffas opened 2 years ago

skoffas commented 2 years ago

Hi @VVasanth and @abhi-rawat1,

This is a very nice library and thank you for your work.

I am trying to generate the MFCCs for test.wav.zip, but jlibrosa's results differ from those of Python's librosa (0.8.1). In particular, the array's shape is (40, 100) in Python and (40, 101) in Java, and the values themselves also differ. The details of the file, generated with soxi, are the following:

Input File     : 'test.wav'
Channels       : 1
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:01.00 = 44100 samples = 75 CDDA sectors
File Size      : 88.2k
Bit Rate       : 706k
Sample Encoding: 16-bit Signed Integer PCM

The Java code that I use is:

import java.io.IOException;
import com.jlibrosa.audio.JLibrosa;
import com.jlibrosa.audio.exception.FileFormatNotSupportedException;
import com.jlibrosa.audio.wavFile.WavFileException;

class testJlibrosa {

    private static int SAMPLE_RATE = 44100;
    private static int N_MFCC = 40;
    private static int N_FFT = 1103;
    private static int L_HOP = 441;
    // Default value according to https://librosa.org/doc/main/generated/librosa.filters.mel.html#librosa.filters.mel
    private static int N_MELS = 128;

    public static void main(String args[]){
        String filename = "test.wav";
        JLibrosa jlibrosa = new JLibrosa();
        try {
            float featureValues[] = jlibrosa.loadAndRead(filename, SAMPLE_RATE, -1);
            float[][] mfccs = jlibrosa.generateMFCCFeatures(featureValues, SAMPLE_RATE, N_MFCC,
                                                            N_FFT, N_MELS, L_HOP);                                      
        } catch (IOException e) {
            System.out.println("The wav file does not exist.");
        } catch (WavFileException e) {
            System.out.println("Something went wrong with the wav file.");
        } catch (FileFormatNotSupportedException e) {
            System.out.println("The file format is not supported.");
        }
    }
}

In Python the same constants are used:

import librosa

# sr=None keeps the file's native sample rate (44100 Hz) instead of resampling
signal, sr = librosa.load("test.wav", sr=None)
mfccs = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40, n_fft=1103, hop_length=441, n_mels=128)
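
As a rough sanity check on the shape difference, the frame counts can be worked out by hand. The sketch below assumes librosa's default centered STFT, which pads `n_fft // 2` samples on each side before framing. The helper names are hypothetical, and the second formula is only a guess at a counting rule that would yield 101 frames; it is not confirmed to be what jlibrosa does.

```python
def centered_frame_count(n_samples: int, n_fft: int, hop_length: int) -> int:
    """Frames from an STFT that pads n_fft // 2 samples on each side."""
    padded = n_samples + 2 * (n_fft // 2)
    return 1 + (padded - n_fft) // hop_length


def simple_frame_count(n_samples: int, hop_length: int) -> int:
    """Hypothetical alternative: one frame per full hop, plus the first frame."""
    return 1 + n_samples // hop_length


# Values taken from the soxi output and the constants used in both programs.
n_samples, n_fft, hop_length = 44100, 1103, 441

print(centered_frame_count(n_samples, n_fft, hop_length))  # 100
print(simple_frame_count(n_samples, hop_length))           # 101
```

Because `n_fft` is odd here, the padding adds 1102 samples rather than 1103, which leaves the signal one sample short of a 101st frame under the centered rule. With an even `n_fft` such as 1102, both formulas would give 101.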

I am using Python 3.6.9, librosa 0.8.1, and OpenJDK version "11.0.11" 2021-04-20. Any help would be appreciated.

Thanks in advance

Edit: Removed an unused variable from the Java example.