Plachtaa / seed-vc

State-of-the-Art zero-shot voice conversion & singing voice conversion with in context learning
GNU General Public License v3.0
546 stars 60 forks

Why do I get the error "cannot reshape tensor of 0 elements into shape [-1, 0]"? #37

Open mahu168 opened 1 week ago

mahu168 commented 1 week ago

Command: python inference.py --source /root/autodl-tmp/test/test_vocals.wav --target /root/autodl-tmp/test/dingzhen_0.wav --output /root/autodl-tmp/test/ --diffusion-steps 50 --length-adjust 1.0 --inference-cfg-rate 0.7 --f0-condition True --auto-f0-adjust False --semi-tone-shift 0

Error:

Warning: Skipped loading some keys due to shape mismatch: {'estimator.input_pos'}
cfm loaded
length_regulator loaded
Loading weights from nvidia/bigvgan_v2_44khz_128band_512x
Removing weight norm...
Traceback (most recent call last):
  File "/root/autodl-tmp/seed-vc/inference.py", line 278, in <module>
    main(args)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/seed-vc/inference.py", line 144, in main
    ref_waves_16k = torchaudio.functional.resample(ref_audio, sr, 16000)
  File "/root/miniconda3/lib/python3.10/site-packages/torchaudio/functional/functional.py", line 1530, in resample
    resampled = _apply_sinc_resample_kernel(waveform, orig_freq, new_freq, gcd, kernel, width)
  File "/root/miniconda3/lib/python3.10/site-packages/torchaudio/functional/functional.py", line 1462, in _apply_sinc_resample_kernel
    waveform = waveform.view(-1, shape[-1])
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous

Additional note: with one of the official example songs as --source it works fine; it only fails with my own file.

CoreBedtime commented 1 week ago

same

Plachtaa commented 1 week ago

It might be a stereo (two-channel) issue. Could you share your source file?
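For anyone who wants to check their own file first, here is a minimal sketch that reads the channel count, sample rate, and duration of a WAV file using only Python's standard `wave` module. The helper name `describe_wav` is hypothetical and not part of seed-vc. The `wave` module only reads PCM WAV, so a float-encoded or compressed file would raise `wave.Error` here.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file.

    Hypothetical helper for inspecting an input file; not part of seed-vc.
    """
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()      # 1 = mono, 2 = stereo
        sample_rate = wf.getframerate()   # frames per second
        duration = wf.getnframes() / sample_rate
    return channels, sample_rate, duration


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        ch, sr, dur = describe_wav(sys.argv[1])
        print(f"channels={ch} sample_rate={sr} duration={dur:.2f}s")
```

Running it as a script with the source path as the only argument prints the three values, which is enough to tell a stereo file from a mono one.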

mahu168 commented 1 week ago

It might be a stereo (two-channel) issue. Could you share your source file?

Sure: test_vocals.zip

Plachtaa commented 1 week ago

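I took a look: the error happens because the source is longer than 30 seconds. For now, I recommend running long audio through app.py; the chunked-processing logic will be synced to inference.py later.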
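As a rough illustration of what processing in chunks can look like, the sketch below computes consecutive sample ranges of at most 30 seconds each. It is not the logic used in app.py, which is not shown in this thread. The function name `chunk_bounds` and its parameters are hypothetical, and the sketch assumes plain back-to-back chunks with no overlap or crossfade, which a real implementation may well need to avoid audible seams.

```python
def chunk_bounds(num_samples, sample_rate, max_seconds=30.0):
    """Split [0, num_samples) into consecutive (start, end) sample ranges.

    Each range covers at most max_seconds of audio. Hypothetical helper,
    not taken from seed-vc; chunks are back to back with no overlap.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    chunk = int(max_seconds * sample_rate)  # chunk length in samples
    if chunk < 1:
        raise ValueError("max_seconds * sample_rate must be at least 1 sample")
    return [
        (start, min(start + chunk, num_samples))
        for start in range(0, num_samples, chunk)
    ]


if __name__ == "__main__":
    # A 70-second clip at 16 kHz yields two full chunks and one shorter one.
    for start, end in chunk_bounds(70 * 16000, 16000):
        print(start, end)
```

Each range could then be sliced out of the waveform, converted on its own, and the results concatenated in order.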