Tool for automatically aligning lyrics to audio using a phonetic recognizer based on Hidden Markov Models. Viterbi decoding with explicit durations of the reference syllables can be toggled on with the parameter WITH_DURATIONS.
The decoder is built from scratch. Alternatively, the tool can be used as a wrapper around HTK (which may be faster) by setting the parameter DECODE_WITH_HTK.
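To illustrate the idea of Viterbi decoding with explicit durations, here is a minimal sketch. It is not the code used in this repository: the function name align_with_durations, the Gaussian-shaped duration penalty and the sigma parameter are all assumptions made for illustration. It assumes a fixed left-to-right sequence of states (for example phonemes or syllables), a table of per-frame log-likelihoods, and one reference duration per state given in frames.

```python
def align_with_durations(loglik, ref_durations, sigma=1.0):
    """Segment T frames into J consecutive states (hypothetical helper).

    loglik[t][j]     : log-likelihood of frame t under state j
    ref_durations[j] : reference duration of state j, in frames
    Returns one (start_frame, end_frame_exclusive) pair per state.
    Requires T >= J, since every state occupies at least one frame.
    """
    T, J = len(loglik), len(ref_durations)
    NEG = float("-inf")

    # cum[j][t] = sum of loglik[u][j] for u < t, so that a segment's
    # observation score is a difference of two prefix sums.
    cum = [[0.0] * (T + 1) for _ in range(J)]
    for j in range(J):
        for t in range(T):
            cum[j][t + 1] = cum[j][t] + loglik[t][j]

    def log_dur(j, d):
        # Assumed duration model: unnormalised Gaussian around the reference.
        return -0.5 * ((d - ref_durations[j]) / sigma) ** 2

    # best[j][t]: best score with states 0..j covering frames 0..t-1 exactly.
    best = [[NEG] * (T + 1) for _ in range(J)]
    back = [[0] * (T + 1) for _ in range(J)]
    for j in range(J):
        for t in range(j + 1, T + 1):
            for d in range(1, t - j + 1):  # state j occupies frames t-d..t-1
                start = t - d
                if j == 0:
                    prev = 0.0 if start == 0 else NEG
                else:
                    prev = best[j - 1][start]
                if prev == NEG:
                    continue
                score = prev + log_dur(j, d) + cum[j][t] - cum[j][start]
                if score > best[j][t]:
                    best[j][t] = score
                    back[j][t] = start

    # Backtrack from the last state ending at the last frame.
    segments, t = [], T
    for j in range(J - 1, -1, -1):
        start = back[j][t]
        segments.append((start, t))
        t = start
    segments.reverse()
    return segments
```

The search costs on the order of J * T^2 operations, so a practical decoder would normally cap the maximum duration considered for each state.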
If you use this work, please cite http://mtg.upf.edu/node/3751
NOTE: A version building upon this research is built by Voice Magix. It features
If you are interested in using it, write to info at voicemagix dot com.
Copyright 2014-2017 Music Technology Group - Universitat Pompeu Fabra
AlignmentDuration is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation (FSF), either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/
For more details see COPYING.txt
BUILD INSTRUCTIONS
NOTE: Python 3 is neither supported nor tested.
git clone https://github.com/georgid/AlignmentDuration.git; sudo apt-get install python-dev python-setuptools python-numpy
pip install -r requirements; python setup.py install
pdnn is needed if OBS_MODEL is MLP or MLP_fuzzy:
cd ..; git clone https://github.com/yajiemiao/pdnn
Also install Theano.
Install HTK; it is needed if either MFCC_HTK or DECODE_WITH_HTK is set to 1.
htkModelParser is needed if working on Turkish makam and OBS_MODEL is GMM:
git clone https://github.com/georgid/htkModelParser.git; cd htkModelParser; sudo pip install -r requirements; python setup.py install
git clone https://github.com/georgid/scikit-learn; sudo apt-get install python-scipy; python setup.py install
makam_acapella is needed if using the MLP_fuzzy model: git clone https://github.com/georgid/makam_acapella
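The conditional dependencies listed above can be summarized as a small lookup. This is only an illustrative sketch: the function required_dependencies and its turkish_makam argument are hypothetical and not part of the repository, and it assumes that Theano is needed only together with pdnn. The scikit-learn fork is left out because the condition under which it is required is not stated.

```python
def required_dependencies(obs_model, mfcc_htk=0, decode_with_htk=0,
                          turkish_makam=False):
    """Return the optional packages implied by a parameter setting.

    Hypothetical helper that mirrors the build notes; the parameter
    names follow OBS_MODEL, MFCC_HTK and DECODE_WITH_HTK.
    """
    deps = []
    if obs_model in ("MLP", "MLP_fuzzy"):
        # Assumption: Theano is installed as a companion of pdnn.
        deps += ["pdnn", "Theano"]
    if mfcc_htk == 1 or decode_with_htk == 1:
        deps.append("HTK")
    if turkish_makam and obs_model == "GMM":
        deps.append("htkModelParser")
    if obs_model == "MLP_fuzzy":
        deps.append("makam_acapella")
    return deps
```

For example, required_dependencies("GMM", decode_with_htk=1, turkish_makam=True) lists HTK and htkModelParser.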
Evaluation (optional; only needed to evaluate alignment accuracy)
cd ..; git clone https://github.com/georgid/AlignmentEvaluation.git
Georgi Dzhambazov, Knowledge-based Probabilistic Modeling for Tracking Lyrics in Music Audio Signals, PhD thesis. See also the thesis materials companion page.
python AlignmentDuration/jingju/runWithParamsAll.py 2 0 /JingjuSingingAnnotation-master/lyrics2audio/results/3folds/ 3 0
to test:
python AlignmentDuration/test/testLyricsAlign.py
with method testLyricsAlign_mandarin_pop
You need to provide the MusicBrainz ID (MBID) of the recording. This requirement could be removed on request.
Install pycompmusic (https://github.com/MTG/pycompmusic), then run: python pycompmusic/compmusic/extractors/makam/lyricsalign.py
or locally, run the script lyricalign_local.py from the noteOnsets branch:
https://github.com/georgid/AlignmentDuration/blob/noteOnsets/src/for_makam/lyricalign_local.py
to test:
python AlignmentDuration/test/testLyricsAlign.py
with method testLyricsAlignMakam
Write to georgi.dzhambazov at upf dot edu or info at voicemagix dot com if you would like to use the English language model. It is not included here for licensing reasons.
Use the evalAccuracy script. A score of 100% means perfect alignment. Values above 80% are usually acceptable to human listeners.
The default evaluation level is set at word boundaries.
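As a rough illustration of what a percentage accuracy at word level can mean, the sketch below computes the share of the total reference duration in which each detected word overlaps its reference interval. This is an assumed, commonly used definition and not necessarily the one implemented by evalAccuracy; the function name alignment_accuracy and the (start, end) tuple format in seconds are hypothetical.

```python
def alignment_accuracy(reference, detected):
    """Percentage of reference duration covered by the matching detected word.

    reference, detected: lists of (start_sec, end_sec), one pair per word,
    in the same order and of the same length (assumed input format).
    """
    if len(reference) != len(detected):
        raise ValueError("reference and detected must have the same length")
    total = 0.0
    correct = 0.0
    for (ref_start, ref_end), (det_start, det_end) in zip(reference, detected):
        total += ref_end - ref_start
        # Overlap between the two intervals, or zero if they do not intersect.
        overlap = min(ref_end, det_end) - max(ref_start, det_start)
        correct += max(0.0, overlap)
    return 100.0 * correct / total
```

Under this definition, a detected alignment identical to the reference scores 100, and any shift of a word boundary lowers the score in proportion to the misaligned duration.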
git clone https://github.com/georgid/AlignmentDuration.git; git checkout for_pycompmusic
cd /homedtic/georgid/test2/AlignmentDuration; source /homedtic/georgid/env/bin/activate; python setup.py install
to test: python /homedtic/georgid/test2/AlignmentDuration/test/testLyricsAlign.py
on server: git pull https://github.com/MTG/pycompmusic; then run /srv/dunya/env/src/pycompmusic/compmusic/extractors/makam/lyricsalign.py with recording MBID 727cff89-392f-4d15-926d-63b2697d7f3f