Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Some minor questions about the VALLE V2 documentation and the demo.ipynb scripts.
The documentation gives an incorrect path for downloading the pre-trained speech tokenizer from HuggingFace:
huggingface-cli download amphion/valle speechtokenizer_hubert_avg/SpeechTokenizer.pt speechtokenizer_hubert_avg/config.json --local-dir ckpts
It should be:
$ mkdir ckpts/speechtokenizer_hubert_avg
and
huggingface-cli download amphion/valle SpeechTokenizer.pt config.json --local-dir ckpts/speechtokenizer_hubert_avg
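For anyone who would rather do this from Python (for example inside the notebook), here is a minimal sketch of the same download using `huggingface_hub.hf_hub_download`. It assumes, as described above, that `SpeechTokenizer.pt` and `config.json` sit at the root of the `amphion/valle` repo. The helper names `expected_paths` and `download_tokenizer` are my own, not part of Amphion.

```python
from pathlib import Path

REPO_ID = "amphion/valle"
# Assumption: both files live at the repo root, not in a subfolder.
FILES = ("SpeechTokenizer.pt", "config.json")
LOCAL_DIR = Path("ckpts/speechtokenizer_hubert_avg")


def expected_paths(local_dir=LOCAL_DIR):
    """Return where each file should end up after the download."""
    return [Path(local_dir) / name for name in FILES]


def download_tokenizer(local_dir=LOCAL_DIR):
    """Create the target directory and fetch both files into it."""
    # Imported lazily so the path helper works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    local_dir = Path(local_dir)
    # parents=True also creates ckpts/ if it does not exist yet.
    local_dir.mkdir(parents=True, exist_ok=True)
    return [
        Path(hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir))
        for name in FILES
    ]
```

Calling `download_tokenizer()` should leave the two files at the locations returned by `expected_paths()`, which is the layout the corrected command above produces.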
BTW, the “VALLE” in the example audio path in the demo script should be capitalized. [Amphion/egs/tts/VALLE_V2/demo.ipynb](https://github.com/open-mmlab/Amphion/blob/main/egs/tts/VALLE_V2/demo.ipynb#L134-L136)