Open mrgloom opened 11 years ago
HDF5 seems interesting. I think it may be considered as a replacement.
Also, see this as an example: http://labrosa.ee.columbia.edu/millionsong/pages/fast-k-nn-using-hdf5
It seems that FLANN can be used with HDF5 http://www.cs.ubc.ca/~mariusm/uploads/FLANN/flann_manual-1.6.pdf
Regarding http://www.tagtraum.com/jipes/index.html: this is a very simple library (it has only a low-performance FFT) and it is written in Java, whereas we currently use .NET.
As I understand it, it is an algorithm for speech recognition. We need other kinds of algorithms.
Regarding HDF5: first, the choice of database engine depends heavily on the features we will use. We are currently evaluating different feature extraction algorithms, so we do not yet know what kind of features we will use in the future. Therefore the choice of database engine is not a pressing question right now, IMHO.
Second, it makes sense to change the DB engine when the current DB does not deliver the needed performance. But at the moment the DB engine has no performance problems. I think we should not introduce features for the sake of features: http://en.wikipedia.org/wiki/KISS_principle
Thanks, @mrgloom, some interesting features were extracted to the following wiki page: http://github.com/johnnybuggy/HOLO/wiki/External-references-and-useful-information-about-audio-feature-extraction
I have read this post http://habrahabr.ru/post/194724/
Here are my suggestions:
1. Feature extraction. You can use more complicated features like those in http://www.tagtraum.com/jipes/index.html (http://www.beatunes.com/ also seems similar to your program) or use http://www.liacs.nl/~dmus/api2011.html
2. DB with kNN search. You can use HDF5 (or something like PyTables) for large data, and the FLANN or ANN library for kNN search, or even CUDA or multiple CPU cores for parallel metric calculation. You can also use http://en.wikipedia.org/wiki/Vector_quantization or a histogram to make feature vectors of the same size, and to reduce dimensions you can use something like PCA (but it needs to be modified for large data).
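To make the second suggestion more concrete, here is a rough sketch of the vector quantization idea followed by a brute-force kNN search. It is only an illustration under assumptions: the codebook, the song names and the frame values are made-up toy data, the function names are hypothetical, and a real setup would presumably learn the codebook (for example with k-means) and replace the linear scan with FLANN or ANN.

```python
import math

def quantize_to_histogram(frames, codebook):
    """Turn a variable-length list of frame vectors into a fixed-size,
    normalized histogram of nearest-codeword counts."""
    counts = [0] * len(codebook)
    for frame in frames:
        # Index of the codeword closest to this frame (Euclidean distance).
        nearest = min(range(len(codebook)),
                      key=lambda i: math.dist(frame, codebook[i]))
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid division by zero for empty songs
    return [c / total for c in counts]

def knn(query, vectors, k):
    """Brute-force kNN: indices of the k vectors closest to the query."""
    order = sorted(range(len(vectors)),
                   key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

# Hypothetical toy data: a 3-word codebook and songs with different
# numbers of 2-dimensional frames.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
songs = {
    "a": [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9)],
    "b": [(0.0, 0.2), (1.1, 1.0), (0.8, 1.0), (0.1, 0.1)],
    "c": [(5.1, 4.9), (4.8, 5.2)],
}

names = list(songs)
# Every song now has a vector of length len(codebook), whatever its duration.
vectors = [quantize_to_histogram(songs[n], codebook) for n in names]

# The two nearest neighbours of song "a" (the query itself comes first).
nearest = [names[i] for i in knn(vectors[0], vectors, k=2)]
print(nearest)
```

The point of the histogram step is that songs of any length end up as vectors of the same size, so they can be stored as rows of one table and compared with a single distance function.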
Also, to define a similarity metric (something smarter than Euclidean or Mahalanobis distance), you can feed pairs of songs into a classifier whose output lies in [0, 1].
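A minimal sketch of what such a pairwise classifier could look like, assuming logistic regression on the per-dimension absolute differences of two feature vectors. The choice of model, the feature vectors and the labels below are all assumptions for illustration, not something taken from the project; any classifier with a probability-like output would fit the same idea.

```python
import math

def pair_features(x, y):
    """Describe a pair of songs by the absolute difference per dimension."""
    return [abs(a - b) for a, b in zip(x, y)]

def predict(weights, bias, x, y):
    """Similarity score in [0, 1] from a logistic model over the pair."""
    z = bias + sum(w * f for w, f in zip(weights, pair_features(x, y)))
    return 1.0 / (1.0 + math.exp(-z))

def train(pairs, labels, dim, epochs=2000, lr=0.5):
    """Fit the logistic model with plain stochastic gradient descent."""
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for (x, y), label in zip(pairs, labels):
            err = predict(weights, bias, x, y) - label
            feats = pair_features(x, y)
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias

# Hypothetical labelled pairs: 1 = "sounds similar", 0 = "sounds different".
# In this toy data only the first dimension matters for similarity.
pairs = [
    ((0.1, 0.9), (0.2, 0.1)),
    ((0.5, 0.2), (0.5, 0.9)),
    ((0.1, 0.5), (0.9, 0.5)),
    ((0.2, 0.1), (0.9, 0.3)),
]
labels = [1, 1, 0, 0]

weights, bias = train(pairs, labels, dim=2)

# Scores for two unseen pairs; the first should come out above 0.5
# and the second below it.
score_similar = predict(weights, bias, (0.30, 0.10), (0.35, 0.85))
score_different = predict(weights, bias, (0.10, 0.40), (0.85, 0.50))
print(score_similar, score_different)
```

Unlike plain Euclidean distance, the learned weights can ignore dimensions that do not matter for perceived similarity, which is the main reason to train a metric from labelled pairs.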