Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
# put your checkpoint file (.bin) in the root path of AmphionVALLEv2
# or use your own pretrained weights
ar_model_path = 'ckpts/valle_ar_mls_196000.bin' # huggingface-cli download amphion/valle valle_ar_mls_196000.bin valle_nar_mls_164000.bin --local-dir ckpts
nar_model_path = 'ckpts/valle_nar_mls_164000.bin'
speechtokenizer_path = 'ckpts/speechtokenizer_hubert_avg'
from models.tts.valle_v2.valle_inference import ValleInference
# change to device='cuda' to use a CUDA GPU for fast inference
# setting use_vocos=True gives better sound quality
# if you have network problems, you can set use_vocos=False, though the quality will be worse
model = ValleInference(ar_path=ar_model_path, nar_path=nar_model_path, speechtokenizer_path=speechtokenizer_path, device="cpu")
load error
/content/Amphion/models/tts/valle_v2/valle_nar.py in __init__(self, config)
42 def __init__(self, config: LlamaConfig):
43 """Override to adaptive layer norm"""
---> 44 super().__init__(config=config, layer_idx=0) # init attention, mlp, etc.
45 self.input_layernorm = LlamaAdaptiveRMSNorm(
46 config.hidden_size, eps=config.rms_norm_eps, dim_cond=config.hidden_size
TypeError: LlamaDecoderLayer.__init__() got an unexpected keyword argument 'layer_idx'
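The TypeError suggests that the `LlamaDecoderLayer` class in the installed `transformers` package does not take a `layer_idx` argument, which would point to a version mismatch between the environment and what `valle_nar.py` expects. Below is a minimal sketch for checking this, assuming `transformers` is importable in the same environment. `accepts_keyword` is a hypothetical helper written for this check; it is not part of Amphion or `transformers`.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`.

    Hypothetical helper for illustration; not part of Amphion or transformers.
    """
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword argument.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        import transformers
        from transformers.models.llama.modeling_llama import LlamaDecoderLayer
    except ImportError:
        print("transformers is not installed in this environment")
    else:
        supported = accepts_keyword(LlamaDecoderLayer.__init__, "layer_idx")
        print("transformers version:", transformers.__version__)
        print("LlamaDecoderLayer accepts layer_idx:", supported)
```

If the last line prints `False`, the installed `transformers` is probably not the version the VALL-E v2 code was written against, and installing the version pinned by the repository (if it pins one) would be the first thing to try.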
Error when loading VALL-E inference
cc : @jiaqili3