JorenSix / TarsosDSP

A Real-Time Audio Processing Framework in Java
http://0110.be/tag/TarsosDSP
GNU General Public License v3.0

Tarsos MFCC android #110

Open andriiaveiro opened 7 years ago

andriiaveiro commented 7 years ago
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

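        // The dispatcher and the MFCC processor must be given the same sample rate.
        final int sampleRate = 22050;
        AudioDispatcher dispatcher = AudioDispatcherFactory.fromDefaultMicrophone(sampleRate, 1024, 0);
        final MFCC mfcc = new MFCC(1024, sampleRate, 40, 50, 300, 3000);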
        dispatcher.addAudioProcessor(mfcc);
        dispatcher.addAudioProcessor(new PitchProcessor(PitchProcessor.PitchEstimationAlgorithm.FFT_YIN, 22050, 1024, new PitchDetectionHandler() {
            @Override
            public void handlePitch(PitchDetectionResult pitchDetectionResult, AudioEvent audioEvent) {
                final float pitchInHz = pitchDetectionResult.getPitch();
                final float[] data= mfcc.getMFCC();
                final String[] dataString = new String[data.length];

                for (int i = 0; i < data.length ; i++) {
                    dataString[i] = " " + data[i];
                }

                runOnUiThread(new Runnable() {
                    @Override
                    public void run() {
                        if(pitchInHz != -1)
                        {
                            ArrayAdapter adapter = new ArrayAdapter(MainActivity.this, R.layout.listview, dataString);
                            ListView listview =(ListView) findViewById(R.id.listview);
                            listview.setAdapter(adapter);
                        }
                    }
                });
            }
        }));
        new Thread(dispatcher,"Audio Dispatcher").start();
    }

What is the correct way to do feature extraction using MFCC on Android?

domagalla commented 7 years ago

Hey, I have the same problem. Are there any solutions?

szuyumi commented 6 years ago

Me too.

juberrahman commented 5 years ago

Is there any example application for getting MFCC features on Android using TarsosDSP?

cxy200927099 commented 5 years ago

The MFCC values I get from the TarsosDSP library are not the same as the values from the Python librosa library. Here is my code. Android code:

private void testMFCC() {

        int sampleRate = 44100;
        // Set parameters; these are the same as in the Python librosa call
        // window size (n_fft)
        int bufferSize = 2048;
        // overlap between two consecutive frames, which gives a hop of 512 samples
        int bufferOverlap = bufferSize-512;
        int fmin = 30;
        int fmax = 3000;//(int) (sampleRate*0.5);
        int n_cep_mel = 40;
        int n_mels = 128;
        InputStream is = null;
        try {
            is = getAssets().open("flute.novib.ff.A4.pcm");

            AudioDispatcher dispatcher = new AudioDispatcher(
                    new UniversalAudioInputStream(
                            is,
                            new TarsosDSPAudioFormat(
                                    sampleRate, 16, 1, true, true)
                    ),
                    bufferSize,
                    bufferOverlap);
            final MFCC mfcc = new MFCC(bufferSize, sampleRate, n_cep_mel, n_mels, fmin,
                    fmax);
            dispatcher.addAudioProcessor(mfcc);
            dispatcher.addAudioProcessor(new AudioProcessor() {

                @Override
                public void processingFinished() {
                    Log.d("CXY", "finish");

                }

                @Override
                public boolean process(AudioEvent audioEvent) {
                    // Fetch the MFCC array. The whole range is copied, so the 0th
                    // (energy) coefficient is kept here rather than discarded.
                    float[] mfccOutput = mfcc.getMFCC();
                    mfccOutput = Arrays.copyOfRange(mfccOutput, 0,
                            mfccOutput.length);

                    // Store in a global ArrayList so that it can easily be written out as CSV
                    mfccList.add(mfccOutput);
                    Log.i("CXY", Arrays.toString(mfccOutput));
//                    Log.d("cxy", "mfccs len="+mfccs.length);
                    return true;
                }
            });
            new Thread(dispatcher, "mfcc-dispatcher").start();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

Python code:

def testMFCC3():
    # file = '../shinian/shinian_qingchang.wav'
    file = 'flute.novib.ff.A4.wav'
    y, sr = librosa.load(file, sr=None)
    print("y_lenth=", len(y), " sampleRate=", sr)
    # extract mfcc
    mfccs = librosa.feature.mfcc(y=y,
                                 sr=sr,
                                 n_mfcc=40,
                                 n_fft=2048,
                                 hop_length=512,
                                 win_length=512,
                                 center=False,
                                 fmin=30,
                                 fmax=3000,
                                 window='hamming',
                                 # norm=None
                                 )
    print(mfccs[:, 0])
    print(mfccs[:, -1])
    print(mfccs.shape)

The MFCC parameters are the same, but on Android the MFCC values are:

[-149.56438, -35.85402, -41.923664, -22.674006, -2.042098, -25.56679, 6.3226676, 4.8530135, 45.501385, 23.00089, -12.665383, -50.659378, -18.775757, 47.545193, -3.471268, -13.6165495, 32.04868, 16.516235, -31.286652, -5.070638, -27.113258, 17.594528, 37.89841, -10.320859, 0.921544, -9.809169, 18.431198, -29.35976, -6.7678466, 13.006239, 33.242813, -22.717735, 18.488544, -18.583925, 0.33961812, 10.058762, 1.9633503, -15.240587, 23.60292, -9.608361]
[-150.98102, -37.915485, -50.66328, -24.781683, 1.3317629, -26.495079, 9.527364, 2.0870152, 48.291397, 12.637501, -6.5011125, -53.482674, -21.084702, 45.460655, -6.1428323, -14.372066, 28.77685, 14.771567, -37.446434, -4.906693, -33.51369, 15.080782, 40.761192, -16.877008, 4.2132025, -12.34782, 15.462919, -28.442005, -6.7560997, 9.3322115, 31.36475, -21.490568, 13.731277, -18.151527, -3.3237329, 5.6522703, 0.5766791, -17.184029, 20.927132, -7.8601766]
[-149.98598, -28.697165, -36.478195, -13.117099, 3.7823079, -35.78266, -1.2555473, 0.88154703, 45.496964, 16.27332, -12.327629, -52.5228, -21.626474, 47.26068, -3.9854593, -17.237074, 34.41523, 12.851591, -34.756172, -4.7409325, -32.593098, 16.971743, 39.860386, -14.635867, 2.2286997, -12.534999, 15.114595, -30.587063, -5.3999805, 9.145244, 33.82593, -21.490015, 15.82396, -19.866163, -2.6191306, 7.699687, 3.2308872, -16.700987, 24.147278, -10.168923]
[-161.39189, -44.11063, -58.580605, -31.302572, -5.818132, -25.09144, 14.345338, -0.9662001, 41.255276, 23.569016, -12.303108, -50.020203, -18.208387, 49.14071, 0.87544966, -12.295719, 36.297127, 17.202654, -32.457478, -1.1665286, -28.237682, 21.16371, 39.04848, -8.577308, 3.7998104, -7.5350323, 19.713837, -30.038479, -2.483957, 10.610859, 31.036594, -19.965364, 17.221785, -15.596262, -1.5734043, 8.746574, 1.7624863, -14.77393, 24.147917, -10.111036]
[-162.89716, -36.87191, -57.368073, -32.390793, -2.3577838, -17.914364, 23.210873, -0.37889734, 48.096287, 25.43931, -4.555753, -49.216537, -11.788088, 47.430347, 4.5558267, -9.664559, 33.707397, 22.717014, -36.12172, -0.863914, -26.60997, 17.467697, 40.332497, -14.446778, 4.9023438, -9.105244, 17.285015, -32.74773, -6.138797, 11.614723, 27.31714, -22.627996, 17.424278, -16.844452, -2.7403166, 10.114424, -0.9374727, -13.568806, 25.236536, -10.849961]

In Python, the MFCC data shape is (40, 5), and the MFCC values of the first and last frames are:

root@d6544d5f00fa:/home/cxy/work/project/music-recognize/code# python3 calc_mfcc.py
y_lenth= 4491  sampleRate= 44100
[-279.21338      18.10227     -78.32788     -19.317728    -20.23542
  -29.225216     -7.920616    -56.20508     -20.31695      48.62169
   97.28233      11.99969     -56.3453       47.673824    -12.512078
  -23.52428      12.715927     16.653606    -30.053783     15.577892
  -44.088417     30.554619     -9.12875       1.024126      3.4306574
   12.599848      3.194224     -1.3946589     9.500994     -5.031226
    0.31443775   -2.6793687     0.9130297    -3.9500008     0.80598986
  -12.3641815    -2.0231156    -5.3813486     3.7782857     5.5124817 ]
[-2.8046588e+02  1.1381366e+01 -7.3924751e+01 -1.8136002e+01
 -1.5018967e+01 -2.9644073e+01 -1.7752523e+01 -5.4225811e+01
 -1.9963863e+01  4.9801498e+01  1.0370607e+02  1.2392213e+01
 -5.6271660e+01  4.6125435e+01 -1.1001463e+01 -1.9814741e+01
  2.0823637e+01  1.6906593e+01 -2.9464365e+01  1.8187700e+01
 -4.5187607e+01  2.9246365e+01 -4.7338562e+00  1.0907283e+00
 -1.1592598e+00  1.3183573e+01 -9.5845109e-01 -2.0085907e-01
  9.3784466e+00 -7.1362739e+00  1.1961896e+00 -7.8641915e-01
 -2.9033558e+00 -6.7620349e-01  5.7919264e+00 -1.1360594e+01
 -2.0949235e+00 -1.9059618e+00  4.7447605e+00  6.7331047e+00]
(40, 5)
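
The (40, 5) shape follows from the parameters in the Python call: with `center=False`, librosa does not pad the signal, so only full frames are counted. A minimal Java sketch of that arithmetic is below. The class and method names are made up for illustration, and it models only librosa's unpadded framing, not how TarsosDSP handles the last partial buffer.

```java
class FrameCount {
    /** Number of full frames when the signal is not padded (librosa's center=False). */
    static int frameCount(int numSamples, int frameLength, int hopLength) {
        if (numSamples < frameLength) {
            return 0; // not enough samples for even one full frame
        }
        // Integer division floors, so a trailing partial frame is not counted.
        return 1 + (numSamples - frameLength) / hopLength;
    }

    public static void main(String[] args) {
        int numSamples = 4491; // length printed by the Python script
        int frameLength = 2048; // n_fft in Python, bufferSize in the Android code
        int hopLength = 512;    // hop_length in Python, bufferSize - bufferOverlap on Android
        System.out.println("frames = " + frameCount(numSamples, frameLength, hopLength));
    }
}
```

For 4491 samples, a 2048-sample frame and a 512-sample hop this gives 5 frames, which matches the second dimension of the reported shape.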

Can anyone help?
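
Two things may be worth checking before expecting identical numbers. First, the two snippets are not configured identically: the Python call sets `win_length=512` inside an `n_fft=2048` frame, while the Android code passes a 2048-sample buffer to `MFCC`. Second, the mel scale itself can differ between libraries. librosa defaults to the Slaney formula (`htk=False`), whereas the HTK formula `2595 * log10(1 + f / 700)` is the other common choice. Whether TarsosDSP uses the HTK variant is an assumption here and should be verified against its `MFCC` source. The sketch below (hypothetical class and method names, plain Java, no TarsosDSP dependency) shows how far apart the two scales place a filter centre for the same `fmin` and `fmax`.

```java
class MelScales {
    // Constants of the Slaney-style scale: linear below 1000 Hz, logarithmic above.
    static final double F_SP = 200.0 / 3.0;            // Hz per mel in the linear region
    static final double MIN_LOG_HZ = 1000.0;           // where the log region starts
    static final double MIN_LOG_MEL = MIN_LOG_HZ / F_SP; // 15 mel
    static final double LOG_STEP = Math.log(6.4) / 27.0;

    static double hzToMelHtk(double hz) {
        return 2595.0 * Math.log10(1.0 + hz / 700.0);
    }

    static double melToHzHtk(double mel) {
        return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0);
    }

    static double hzToMelSlaney(double hz) {
        if (hz < MIN_LOG_HZ) {
            return hz / F_SP;
        }
        return MIN_LOG_MEL + Math.log(hz / MIN_LOG_HZ) / LOG_STEP;
    }

    static double melToHzSlaney(double mel) {
        if (mel < MIN_LOG_MEL) {
            return mel * F_SP;
        }
        return MIN_LOG_HZ * Math.exp(LOG_STEP * (mel - MIN_LOG_MEL));
    }

    /** Frequency halfway between fmin and fmax on the HTK mel scale, in Hz. */
    static double midpointHtk(double fmin, double fmax) {
        return melToHzHtk((hzToMelHtk(fmin) + hzToMelHtk(fmax)) / 2.0);
    }

    /** Frequency halfway between fmin and fmax on the Slaney mel scale, in Hz. */
    static double midpointSlaney(double fmin, double fmax) {
        return melToHzSlaney((hzToMelSlaney(fmin) + hzToMelSlaney(fmax)) / 2.0);
    }

    public static void main(String[] args) {
        double fmin = 30.0;   // fmin used in both snippets
        double fmax = 3000.0; // fmax used in both snippets
        System.out.println("HTK midpoint (Hz):    " + midpointHtk(fmin, fmax));
        System.out.println("Slaney midpoint (Hz): " + midpointSlaney(fmin, fmax));
    }
}
```

If the two midpoints differ noticeably, the triangular filters sit at different frequencies and the coefficients will not match even with identical input. On the Python side, passing `htk=True` to `librosa.feature.mfcc` (it is forwarded to the mel filter bank) is one way to test this hypothesis. Differences in log scaling and DCT normalization between the libraries could still remain.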