georgid / AlignmentDuration

Lyrics-to-audio alignment system based on machine learning algorithms: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
http://mtg.upf.edu/node/3751
GNU Affero General Public License v3.0

Implement Viterbi in C++ #58

Open georgid opened 6 years ago

georgid commented 6 years ago

With essentia’s viterbi https://github.com/MTG/essentia/issues/253 or

georgid commented 6 years ago

Before starting, read the README up to the part on the Lyrics class.

I. Reimplement essential methods:

  1. Reimplement the creation of the transition probability matrix. It takes as a parameter the statesNetwork, which is a list of StateWithDur objects, from which the transition matrix needs only a few methods such as getWaitProb(). Maybe it is best to keep _HMM as a wrapper that calls the C++ functions and to leave the StateWithDur-specific logic in that wrapper. This way there is no need to reimplement StateWithDur in C++.

  2. Reimplement the calculation of the observation likelihood matrix, src.hmm.continuous._ContinuousHMM._ContinuousHMM._mapB. Note that the method pdfAllfeatures is implemented in MLPHMM and calls Theano using the DNN model. So maybe the observation matrix should remain in Python, and we need to find a way to combine the numpy array with the transition matrix in C++?

  3. Reimplement the forced Viterbi algorithm here. You can also check the method InitDecodingParameters, which precomputes the observation matrix.

In fact, 1. and 3. mean reimplementing all methods in class _HMM except viterbi_fast() and visualize_trans_probs(). Is it possible to write a class HMM in C++ without changing the class hierarchy in Python? MLPHMM needs _HMM as its parent: see this diagram
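One possible shape for this, as a rough sketch only: the Python wrapper reduces each state object to a plain number, and the compiled routine sees nothing but a list of floats, so the class hierarchy stays in Python. Everything here is hypothetical except the method name getWaitProb(): `StubState` stands in for StateWithDur, `build_trans_matrix` stands in for the future C++ function, and the left-to-right layout (stay with the wait probability, advance with its complement) is an assumption rather than the repository's actual transition logic.

```python
class StubState:
    """Hypothetical stand-in for StateWithDur; only getWaitProb() is used."""

    def __init__(self, wait_prob):
        self._wait_prob = wait_prob

    def getWaitProb(self):
        return self._wait_prob


def wait_probs_from_states(states_network):
    """Python side: reduce the state objects to a plain list of floats."""
    return [float(state.getWaitProb()) for state in states_network]


def build_trans_matrix(wait_probs):
    """Placeholder for the compiled routine: it receives numbers only.

    Assumption: a left-to-right chain in which state i stays with
    probability p and advances to state i + 1 with probability 1 - p.
    """
    n = len(wait_probs)
    trans = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(wait_probs):
        if i == n - 1:
            trans[i][i] = 1.0  # the last state has nowhere to advance to
        else:
            trans[i][i] = p
            trans[i][i + 1] = 1.0 - p
    return trans


class HMMWrapper:
    """Hypothetical thin wrapper playing the role of _HMM."""

    def __init__(self, states_network, backend=build_trans_matrix):
        # Swapping `backend` for a C++ binding would not touch subclasses.
        self.trans_matrix = backend(wait_probs_from_states(states_network))


class MLPHMMLike(HMMWrapper):
    """Hypothetical subclass: keeps its Python parent, as MLPHMM would."""
```

With this split, the observation matrix could cross the same boundary as a plain array, which would leave pdfAllfeatures and the DNN call on the Python side.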

  4. Reimplement the backtracking
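As a reference for what the port has to reproduce, here is a minimal plain-Python sketch of forced Viterbi decoding with backtracking. It is not the code in _HMM: it ignores the duration-aware parts of the model and assumes a dense transition matrix `trans[i][j]`, an observation likelihood matrix `obs_lik[t][j]`, and a path forced to start in the first state and end in the last one. The function and variable names are made up for illustration.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def forced_viterbi(trans, obs_lik):
    """Return (best_path, log_prob) for a path forced from state 0 to the last state.

    trans[i][j]   probability of moving from state i to state j
    obs_lik[t][j] likelihood of frame t under state j
    """
    n_frames, n_states = len(obs_lik), len(trans)
    delta = [[NEG_INF] * n_states for _ in range(n_frames)]  # best log score
    psi = [[0] * n_states for _ in range(n_frames)]          # back pointers

    # Forced start: only state 0 is allowed at the first frame.
    delta[0][0] = safe_log(obs_lik[0][0])

    for t in range(1, n_frames):
        for j in range(n_states):
            best_i, best = 0, NEG_INF
            for i in range(n_states):
                score = delta[t - 1][i] + safe_log(trans[i][j])
                if score > best:
                    best_i, best = i, score
            delta[t][j] = best + safe_log(obs_lik[t][j])
            psi[t][j] = best_i

    # Backtracking: forced end in the last state, then follow the pointers.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return path, delta[-1][-1]
```

Working in the log domain avoids underflow on long recordings. The back-pointer table `psi` is the only state the backtracking step needs, so it is a natural thing to return from a compiled forward pass if the backtracking were kept separate.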

II. Run an integration test
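One way such a test could look, sketched under assumptions: on tiny random models, any decoder (the existing Python one or a C++ binding) is compared against an exhaustive search over all paths. The names `random_case`, `brute_force_align` and `check_decoder` are invented for this sketch, the decoder is assumed to be a callable that takes `(trans, obs)` and returns a state path, and the simple left-to-right model is again a simplification of the duration-aware one in the repository.

```python
import itertools
import math
import random


def random_case(rng, n_states, n_frames):
    """Random left-to-right transition matrix and positive observation likelihoods."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            trans[i][i] = 1.0
        else:
            p = rng.uniform(0.1, 0.9)
            trans[i][i], trans[i][i + 1] = p, 1.0 - p
    obs = [[rng.uniform(0.05, 1.0) for _ in range(n_states)]
           for _ in range(n_frames)]
    return trans, obs


def path_log_prob(trans, obs, path):
    """Log probability of one state path; -inf if it uses a forbidden transition."""
    score = math.log(obs[0][path[0]])
    for t in range(1, len(path)):
        p = trans[path[t - 1]][path[t]]
        if p <= 0.0:
            return float("-inf")
        score += math.log(p) + math.log(obs[t][path[t]])
    return score


def brute_force_align(trans, obs):
    """Exhaustive oracle: best path from state 0 to the last state."""
    n_states, n_frames = len(trans), len(obs)
    best_path, best = None, float("-inf")
    for middle in itertools.product(range(n_states), repeat=n_frames - 2):
        path = [0, *middle, n_states - 1]
        score = path_log_prob(trans, obs, path)
        if score > best:
            best_path, best = path, score
    return best_path, best


def check_decoder(candidate, n_cases=20, seed=0):
    """True if `candidate(trans, obs)` matches the oracle's score on every case."""
    rng = random.Random(seed)
    for _ in range(n_cases):
        trans, obs = random_case(rng, n_states=3, n_frames=5)
        _, ref_score = brute_force_align(trans, obs)
        path = candidate(trans, obs)
        if path[0] != 0 or path[-1] != len(trans) - 1:
            return False
        # Compare scores rather than paths so that ties do not cause failures.
        if abs(path_log_prob(trans, obs, path) - ref_score) > 1e-9:
            return False
    return True
```

The exhaustive search grows exponentially with the number of frames, so this only works for toy sizes. A check on real recordings would instead compare the new decoder's output with the alignment produced by the current Python implementation.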

georgid commented 6 years ago

Step 3 is done in the fork.