aubio / aubio

a library for audio and music analysis
https://aubio.org
GNU General Public License v3.0

Mirroring TensorFlow features #235

Open oreganrob opened 5 years ago

oreganrob commented 5 years ago

Hi,

I'm currently using the aubio iOS library in a CoreML proof of concept I'm building in Swift. The model I'm using is a TensorFlow model based on the speech_commands example that ships with the TF source code.

The TF speech_commands example has a wav_to_features python script that can be used to test the TF model...

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/speech_commands/wav_to_features.py

I can also use the extracted TF features, hard coded, in my app and they work fine with my CoreML model. What I am now trying to do is use Aubio to extract similar audio features to those produced by TF. I've got the code working in principle based on a couple of posts such as...

https://github.com/aubio/aubio/issues/121

and I can extract features that are similar but not the same. Where I am struggling is with the configuration of the hop size, win size etc...

The TF features have the following structure...

/* File automatically created by

And I've tried the following...

let hop_size: uint_t = 160
let win_s: uint_t = 480
let n_filters: uint_t = 40
let n_coefs: uint_t = 26
let samplerate: uint_t = 16000

But the features aren't quite the same. I'm wondering if you might be able to point me in the correct direction?
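For reference, here is a minimal sketch of how the hop and window sizes above translate into a frame count, which is one quick way to check whether the aubio output has the same shape as the TF features. It assumes a 1000 ms clip and no padding at the clip edges; neither is stated in this thread. The helpers `samples(forMilliseconds:sampleRate:)` and `frameCount(clipSamples:windowSize:hopSize:)` are hypothetical names for illustration and are not part of aubio or TensorFlow.

```swift
// Hypothetical helper: convert a duration in milliseconds to a sample count.
func samples(forMilliseconds ms: Double, sampleRate: Int) -> Int {
    return Int(Double(sampleRate) * ms / 1000.0)
}

// Hypothetical helper: number of full analysis windows that fit in a clip
// when the window advances by hopSize samples and the clip is not padded.
func frameCount(clipSamples: Int, windowSize: Int, hopSize: Int) -> Int {
    guard hopSize > 0, clipSamples >= windowSize else { return 0 }
    return 1 + (clipSamples - windowSize) / hopSize
}

// Values from the post above.
let sampleRate = 16000
let hopSize = 160   // 10 ms at 16 kHz
let winSize = 480   // 30 ms at 16 kHz
let nCoefs = 26

// Assumption: a 1000 ms clip (not stated in the thread).
let clipSamples = samples(forMilliseconds: 1000, sampleRate: sampleRate)

let frames = frameCount(clipSamples: clipSamples, windowSize: winSize, hopSize: hopSize)
let totalValues = frames * nCoefs

print("frames: \(frames), values per frame: \(nCoefs), total: \(totalValues)")
```

Under these assumptions the clip yields 98 frames, so 26 coefficients per frame gives 2548 values in total. If the TF feature array has a different length, the mismatch is probably in the frame count or in the number of coefficients per frame rather than in the values themselves.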

Thanks!

piem commented 5 years ago

hi @Fatlog

I'm preparing an update that should help address this topic, but it will take a bit more time.

thanks for your patience, piem

oreganrob commented 5 years ago

wow, great! Thanks piem. I was mainly wondering if it was even possible but this is even better :)