georgid / AlignmentDuration

Lyrics-to-audio alignment system. Based on machine learning algorithms: Hidden Markov Models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
http://mtg.upf.edu/node/3751
GNU Affero General Public License v3.0
56 stars 6 forks

reduce dependency on htkmfc #44

Open georgid opened 7 years ago

georgid commented 7 years ago

Make sure that extracting MFCCs with essentia gives the same features as those used for the damp model:

- add pre-emphasis (or recreate the model without pre-emphasis)
- add cepstral mean normalization
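A minimal sketch of the two steps listed above, written with plain NumPy rather than essentia or HTK. The function names and the coefficient `0.97` are assumptions for illustration (0.97 is a commonly used pre-emphasis value, not one confirmed for the damp model), and the filter here runs over the whole signal, which may differ from a toolkit that applies it frame by frame.

```python
import numpy as np


def preemphasize(signal, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    coeff=0.97 is an assumed, commonly used value; check the
    configuration the model was actually trained with.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size == 0:
        return signal
    out = np.empty_like(signal)
    out[0] = signal[0]  # first sample has no predecessor, keep it as is
    out[1:] = signal[1:] - coeff * signal[:-1]
    return out


def cepstral_mean_normalize(mfcc):
    """Subtract the per-coefficient mean over all frames.

    mfcc is assumed to have shape (n_frames, n_coefficients).
    """
    mfcc = np.asarray(mfcc, dtype=np.float64)
    return mfcc - mfcc.mean(axis=0, keepdims=True)
```

In this sketch, pre-emphasis would be applied to the raw audio before MFCC extraction, and the mean normalization to the resulting MFCC matrix afterwards.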

georgid commented 6 years ago

Pre-emphasis is reproduced here, but there is still a discrepancy between the MFCCs. The difference can be seen by running this code.

For a cappella recordings the difference seems to lead to only a minor decrease in performance, but when source separation is applied, performance is critically worse.
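As a rough illustration of how such a discrepancy between two MFCC extractions could be quantified, here is a hypothetical helper. It is not the script linked above; the function name, the returned keys, and the assumed `(n_frames, n_coefficients)` layout are all assumptions.

```python
import numpy as np


def mfcc_difference(mfcc_a, mfcc_b):
    """Compare two MFCC matrices of shape (n_frames, n_coefficients).

    Frame counts may differ slightly between extractors, so only the
    overlapping leading frames are compared (an assumption of this sketch).
    """
    a = np.asarray(mfcc_a, dtype=np.float64)
    b = np.asarray(mfcc_b, dtype=np.float64)
    if a.ndim != 2 or b.ndim != 2 or a.shape[1] != b.shape[1]:
        raise ValueError("both inputs must be 2-D with the same number of coefficients")
    n = min(a.shape[0], b.shape[0])
    if n == 0:
        raise ValueError("no frames to compare")
    diff = np.abs(a[:n] - b[:n])
    return {
        "frames_compared": n,
        "max_abs": float(diff.max()),
        # one value per cepstral coefficient, useful for spotting
        # whether only some coefficients (for example c0) disagree
        "mean_abs_per_coeff": diff.mean(axis=0),
    }
```

Looking at the per-coefficient means rather than a single number can help tell a constant offset (for example a missing normalization step) apart from a difference spread across all coefficients.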