[BUG]: Feature lengths from FACodecEncoderV2 do not match

Bug in FACodecEncoderV2

I extracted prosody_feature and encoder_output with FACodecEncoderV2. Passing them to fa_decoder_v2 to extract the VQ codes raises an error because their lengths along the last (time) dimension differ: prosody_feature has shape torch.Size([1, 20, 281]), while encoder_output has shape torch.Size([1, 256, 282]).

my code

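    import librosa
    import torch

    # fa_encoder_v2 and fa_decoder_v2 are assumed to be loaded beforehand.
    wav_b = librosa.load(wav_b, sr=16000)[0]  # wav_b initially holds the audio file path
    wav_b = torch.from_numpy(wav_b).float()
    wav_b = wav_b.unsqueeze(0).unsqueeze(0)  # shape: (1, 1, num_samples)

    enc_out_b = fa_encoder_v2(wav_b)
    prosody_b = fa_encoder_v2.get_prosody_feature(wav_b)

    vq_post_emb_b, vq_id_b, _, quantized, spk_embs_b = fa_decoder_v2(
        enc_out_b, prosody_b, eval_vq=False, vq=True
    )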

bug

    File "/home/data/mahaotian/Amphion/models/codec/ns3_codec/inference_codc.py", line 129, in <module>
      vq_post_emb_a, vq_id_a, _, quantized, spk_embs_a = fa_decoder_v2(
    File "/home/data/mahaotian/anaconda3/envs/vallex/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
      return self._call_impl(*args, **kwargs)
    File "/home/data/mahaotian/anaconda3/envs/vallex/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
      return forward_call(*args, **kwargs)
    File "/home/data/mahaotian/Amphion/models/codec/ns3_codec/facodec.py", line 1086, in forward
      outs, qs, commit_loss, quantized_buf = self.quantize(
    File "/home/data/mahaotian/Amphion/models/codec/ns3_codec/facodec.py", line 1048, in quantize
      outs += out
    RuntimeError: The size of tensor a (281) must match the size of tensor b (282) at non-singleton dimension 2
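
A possible workaround, sketched here under assumptions rather than as a fix for the underlying mismatch, is to crop both features to the shorter length along the time axis before calling the decoder. The helper crop_to_common_length is hypothetical and not part of Amphion, and the dummy tensors only mimic the shapes reported above. Dropping the extra frame discards a small amount of trailing audio context, and whether that is acceptable for FACodec has not been verified.

```python
import torch


def crop_to_common_length(*features: torch.Tensor) -> list[torch.Tensor]:
    """Crop every tensor along its last (time) axis to the shortest length.

    Hypothetical helper for illustration; it is not part of Amphion.
    """
    min_len = min(f.shape[-1] for f in features)
    return [f[..., :min_len] for f in features]


# Dummy tensors with the shapes from the report stand in for the real
# outputs of fa_encoder_v2(wav_b) and fa_encoder_v2.get_prosody_feature(wav_b).
enc_out_b = torch.randn(1, 256, 282)
prosody_b = torch.randn(1, 20, 281)

enc_out_b, prosody_b = crop_to_common_length(enc_out_b, prosody_b)
print(enc_out_b.shape, prosody_b.shape)
# torch.Size([1, 256, 281]) torch.Size([1, 20, 281])
```

With the real tensors, the cropped enc_out_b and prosody_b would then be passed to fa_decoder_v2 in place of the originals.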

Source: open-mmlab / Amphion, issue #188
