georgid / AlignmentDuration

Lyrics-to-audio alignment system based on machine learning: hidden Markov models with Viterbi forced alignment. The alignment is explicitly aware of the durations of musical notes. The phonetic models are classified with an MLP deep neural network.
http://mtg.upf.edu/node/3751
GNU Affero General Public License v3.0

remove hard-coded logic for discriminating between two duration distributions #13

Closed georgid closed 9 years ago

georgid commented 9 years ago

Either in StateWithDur, Phoneme, or _DurationHMM (best here), add a field indicating which duration distribution to use (sketched below).

BACKGROUND: For the exponential distribution there is no explicit duration, only a transition probability, and only a middle state instead of three (because I don't know how to implement the three states with an exponential distribution).

PROBLEM: 1) hard-coded: sil on the first and last phoneme has only a middle state, in LyricsWithModels.LyricsWithModels._linkToModels

2) hard-coded: on the first and last state there is an exponential distribution, in hmm.continuous._DurationHMM._DurationHMM.getWaitLogLik

3) set the distribution type in Lyrics.Lyrics._words2Phonemes
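
A minimal sketch of what the proposed field could look like, assuming a StateWithDur-like class; the constant names and the distributionType attribute are hypothetical, not the existing API:

    # Hypothetical names for the two duration-distribution variants mentioned in this issue
    EXPONENTIAL = 'exponential'      # silence-like: no duration model, only a wait (self-transition) prob, single middle state
    WITH_DURATION = 'withDuration'   # regular phonemes: explicit note-duration model, three states

    class StateWithDur(object):
        def __init__(self, durationInFrames, distributionType=WITH_DURATION):
            self.durationInFrames = durationInFrames
            # the field proposed here: which duration distribution this state uses
            self.distributionType = distributionType

_linkToModels and getWaitLogLik could then branch on distributionType instead of on the position (first/last) of the phoneme in the network.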

georgid commented 9 years ago

Also hard-coded in LyricsWithModels.LyricsWithModels.duration2numFrameDuration

georgid commented 9 years ago

Also hard-coded in Decoder.decodeAudio():

the transMatrix for state 0, which is silence:

        # transition matrix of phoneme 0 (the silence state, hard-coded)
        transMatrix = self.lyricsWithModels.phonemesNetwork[0].getTransMatrix()
        # use its self-transition probability at index [2,2] as the wait probability of the silence state
        self.hmmNetwork.setWaitProbSilState(transMatrix[2,2])
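
A hedged sketch of how this might be generalised once a distribution-type field exists; setWaitProbSilState, phonemesNetwork and getTransMatrix are taken from the snippet above, while distributionType and the 'exponential' tag are hypothetical:

    # Hypothetical generalisation: find whichever phoneme is marked as exponential,
    # instead of assuming it is phonemesNetwork[0]
    for phoneme in self.lyricsWithModels.phonemesNetwork:
        if getattr(phoneme, 'distributionType', None) == 'exponential':
            transMatrix = phoneme.getTransMatrix()
            # same self-transition entry [2,2] as in the current hard-coded version
            self.hmmNetwork.setWaitProbSilState(transMatrix[2,2])
            break
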
georgid commented 9 years ago

Remove setDurationInMinUnit from the Phoneme class. A Phoneme is assigned a durInFrames directly.

Rewrite its only reference (in Lyrics._words2Phonemes) in a cleverer way: phonemeSil.setDurationInMinUnit('1')
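
One possible shape of that rewrite, a sketch only; phonemeSil comes from the line above, and silDurationInFrames is an assumed, precomputed value rather than an existing variable:

    # hypothetical sketch for Lyrics._words2Phonemes: assign the frame duration directly
    # instead of calling phonemeSil.setDurationInMinUnit('1')
    phonemeSil.durInFrames = silDurationInFrames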