length mismatch for FACodecDecoderV2

https://github.com/open-mmlab/Amphion/blob/58dc8707dec735fdb381d351fc123bec9242b204/models/codec/ns3_codec/facodec.py#L1048

The line linked above raises a RuntimeError when the input x has shape torch.Size([1, 256, 583]).

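For V2, the prosody encoder's input is a mel spectrogram, while the other encoders' inputs are still the waveform, so the two outputs can differ in length by one frame (582 vs. 583 here). Some padding or trimming is needed to ensure the two outputs have the same length.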
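A minimal sketch of the kind of padding/trimming meant here, assuming both outputs are PyTorch tensors with time as the last dimension. The helper name `align_time_dim` and its `mode` argument are hypothetical and not part of Amphion; zero-padding on the right is also just an assumption, and trimming to the shorter length may be the safer default.

```python
import torch
import torch.nn.functional as F


def align_time_dim(a, b, mode="trim"):
    """Hypothetical helper: make a and b equal in length along the last (time) dim."""
    len_a, len_b = a.shape[-1], b.shape[-1]
    if len_a == len_b:
        return a, b
    if mode == "trim":
        # Cut both tensors down to the shorter length.
        n = min(len_a, len_b)
        return a[..., :n], b[..., :n]
    if mode == "pad":
        # Zero-pad the shorter tensor on the right up to the longer length.
        # F.pad takes (left, right) amounts for the last dimension.
        n = max(len_a, len_b)
        return F.pad(a, (0, n - len_a)), F.pad(b, (0, n - len_b))
    raise ValueError(f"unknown mode: {mode!r}")


# Shapes taken from the error message in this issue (582 vs. 583 frames).
short = torch.zeros(1, 256, 582)
long = torch.zeros(1, 256, 583)

trimmed_short, trimmed_long = align_time_dim(short, long, mode="trim")
summed = trimmed_short + trimmed_long  # no size mismatch after alignment
```

With something like this applied before the outputs are summed in quantize, the `outs += out` line would see tensors of equal length.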

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
The size of tensor a (582) must match the size of tensor b (583) at non-singleton dimension 2
  File "/home/chenjiasheng/code/amphion/Amphion/models/codec/ns3_codec/facodec.py", line 1048, in quantize
    outs += out
  File "/home/chenjiasheng/code/amphion/Amphion/models/codec/ns3_codec/facodec.py", line 1086, in forward
    outs, qs, commit_loss, quantized_buf = self.quantize(
  File "/mfa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mfa_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/chenjiasheng/code/amphion/test_facodec_v2.py", line 58, in <module>
    vq_post_emb_b, vq_id_b, _, quantized, spk_embs_b = fa_decoder_v2(
  File "/mfa_env/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/mfa_env/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
RuntimeError: The size of tensor a (582) must match the size of tensor b (583) at non-singleton dimension 2
