Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
The error below is raised when the input x has shape torch.Size([1, 256, 583]).
In V2, the prosody encoder's input is a mel spectrogram, while the other encoders' inputs are still waveforms.
Some padding or cropping is needed to ensure the two outputs have the same length.
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: _run_module_as_main)
The size of tensor a (582) must match the size of tensor b (583) at non-singleton dimension 2
File "/home/chenjiasheng/code/amphion/Amphion/models/codec/ns3_codec/facodec.py", line 1048, in quantize
outs += out
File "/home/chenjiasheng/code/amphion/Amphion/models/codec/ns3_codec/facodec.py", line 1086, in forward
outs, qs, commit_loss, quantized_buf = self.quantize(
File "/mfa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/mfa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chenjiasheng/code/amphion/test_facodec_v2.py", line 58, in <module>
vq_post_emb_b, vq_id_b, _, quantized, spk_embs_b = fa_decoder_v2(
File "/mfa_env/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/mfa_env/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
RuntimeError: The size of tensor a (582) must match the size of tensor b (583) at non-singleton dimension 2
https://github.com/open-mmlab/Amphion/blob/58dc8707dec735fdb381d351fc123bec9242b204/models/codec/ns3_codec/facodec.py#L1048
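The length mismatch described above (582 vs. 583 frames on dimension 2) could be worked around by aligning the two tensors before they are summed. The following is only a minimal sketch of that idea, not the actual Amphion fix: the helper name match_time_length and the tensors out_a and out_b are hypothetical, and only the shapes are taken from the error message.

```python
import torch
import torch.nn.functional as F


def match_time_length(a: torch.Tensor, b: torch.Tensor, mode: str = "crop"):
    """Make two (batch, channels, time) tensors agree on the last (time) axis.

    Hypothetical helper for illustration only.
    mode="crop" trims the longer tensor; mode="pad" zero-pads the shorter one.
    """
    len_a, len_b = a.shape[-1], b.shape[-1]
    if len_a == len_b:
        return a, b
    if mode == "crop":
        target = min(len_a, len_b)
        return a[..., :target], b[..., :target]
    if mode == "pad":
        target = max(len_a, len_b)
        # F.pad takes (left, right) amounts for the last dimension.
        a = F.pad(a, (0, target - len_a))
        b = F.pad(b, (0, target - len_b))
        return a, b
    raise ValueError(f"unknown mode: {mode!r}")


# Stand-ins for the two encoder outputs; shapes come from the error message.
out_a = torch.randn(1, 256, 582)
out_b = torch.randn(1, 256, 583)

# Without alignment, out_a + out_b raises the RuntimeError reported above.
a, b = match_time_length(out_a, out_b, mode="crop")
summed = a + b
print(summed.shape)
```

Cropping drops one frame from the longer tensor, while padding appends a zero frame to the shorter one. Which of the two is appropriate here depends on how the mel and waveform branches are framed, which this sketch does not try to determine.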