Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Address fs2 preprocessing exception due to mismatched package versions
Refactor the MFA file structure into /pretrained and modify the control logic for fetching and managing MFA files
🚧 Related Issues
Issue #118: training on the LJSpeech dataset fails because of outdated file paths for the lexicon dictionary.
Issue #119: caused by the package manager installing librosa 0.10+, which deprecated audioread support (see the librosa 0.10 documentation).
Further testing surfaced additional environment errors caused by newer dependency versions. Isolation testing confirmed that diffsptk triggers the error, specifically through its dependency vector-quantize-pytorch. By default the package manager fetches vector_quantize_pytorch 1.12.16 (note the underscores instead of hyphens in the package name) rather than vector-quantize-pytorch. The version of this package should therefore also be capped at 1.12.5 for now, until the package manager's behavior changes.
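For context on the underscore/hyphen difference: PEP 503 defines a name normalization under which runs of `-`, `_`, and `.` are treated as equivalent, so a version pin written with either spelling should resolve to the same project. The sketch below only illustrates that rule; the helper name `normalize_name` is hypothetical and is not part of Amphion or pip.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a distribution name per PEP 503.

    Runs of '-', '_' and '.' collapse to a single '-', and the result is lowercased.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings seen during testing normalize to the same project name.
same = normalize_name("vector_quantize_pytorch") == normalize_name("vector-quantize-pytorch")
print(same)
```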
👨‍💻 Changes Proposed
[x] Modify directories in preprocessors/ljspeech.py so that MFA files are saved under /pretrained instead of the root folder, in line with Amphion's file management convention; additionally, point the lexicon directory to the provided lexicon file to avoid redundant files.
[x] Rewrite prepare_mfa.sh and modify run.sh for more robust logic in fetching and managing MFA files, removing the redundant section that downloads the LJSpeech lexicon.
[x] Modify env.sh to pin librosa to 0.9.1 and vector-quantize-pytorch to 1.12.5.
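To confirm that an environment actually matches the versions pinned in env.sh (librosa 0.9.1, vector-quantize-pytorch 1.12.5), a small check along these lines could be run after setup. This is a minimal sketch and not part of the PR: the names `PINS` and `find_mismatches` are hypothetical, and it assumes Python 3.8+ for `importlib.metadata`.

```python
from importlib import metadata

# Pins taken from the env.sh change in this PR.
PINS = {"librosa": "0.9.1", "vector-quantize-pytorch": "1.12.5"}


def find_mismatches(pins, get_version=metadata.version):
    """Return {package: installed_version_or_None} for each package that differs from its pin.

    `get_version` is injectable so the check can be exercised without the packages installed.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_mismatches(PINS).items():
        print(f"{name}: expected {PINS[name]}, found {installed}")
```

An empty result means both packages are installed at the pinned versions; any entry points to a package that is missing or was resolved to a different version.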
🧑‍🤝‍🧑 Who Can Review?
@lmxue @RMSnow
🛠 TODO
Potential consideration: several issues seem to arise from preparing environments. One option could be to freeze the versions of certain packages for better stability. @RMSnow
✅ Checklist
[x] Code has been reviewed
[x] Code complies with the project's code standards and best practices
[x] Code has passed all tests
[x] Code does not affect the normal use of existing features
[x] Code has been commented properly
[x] Documentation has been updated (if applicable)
[x] Demo/checkpoint has been attached (if applicable)