PlayVoice / lora-svc

singing voice change based on whisper, and lora for singing voice clone
MIT License

incorrect audio shape #87

Open sergedahdouh opened 1 year ago

sergedahdouh commented 1 year ago

data_svc/waves-16k/
data_svc/whisper
speaker0<<<<<<<<<<
(639, 1024)
speaker1<<<<<<<<<<
Traceback (most recent call last):
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 54, in <module>
    pred_ppg(whisper, f"{wavPath}/{spks}/{file}.wav", f"{ppgPath}/{spks}/{file}.ppg")
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 26, in pred_ppg
    ppg = whisper.encoder(mel.unsqueeze(0)).squeeze().data.cpu().float().numpy()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/lora-svc/whisper/model.py", line 154, in forward
    assert len_x <= len_e, "incorrect audio shape"
AssertionError: incorrect audio shape

Any idea what the issue is? speaker0 is a recording of my own voice, around 11 seconds long, and speaker1 is a song, around 57 seconds long.

MaxMax2016 commented 1 year ago

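Cut the audio into clips shorter than 30 seconds. whisper only accepts up to 30 seconds of audio at a time, so the 57-second song trips the assertion while the 11-second recording passes.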
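One possible way to do the cutting, as a minimal sketch rather than anything from the lora-svc repo: the helpers `chunk_bounds` and `split_wav` below are hypothetical names, the 29-second limit is an assumed safety margin under the 30-second window, and the code assumes uncompressed PCM WAV input readable by Python's standard `wave` module.

```python
import wave
from pathlib import Path

MAX_SECONDS = 29  # assumption: stay a little under whisper's 30-second window


def chunk_bounds(n_frames, sample_rate, max_seconds=MAX_SECONDS):
    """Return (start, end) frame indices that cover n_frames in pieces of at most max_seconds."""
    step = int(sample_rate * max_seconds)
    return [(start, min(start + step, n_frames)) for start in range(0, n_frames, step)]


def split_wav(src, out_dir, max_seconds=MAX_SECONDS):
    """Split one PCM WAV file into numbered chunks and return the new paths."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        bounds = chunk_bounds(params.nframes, params.framerate, max_seconds)
        for i, (start, end) in enumerate(bounds):
            # Frames are read sequentially, so each call continues where the last one stopped.
            frames = reader.readframes(end - start)
            path = out_dir / f"{src.stem}_{i:03d}.wav"
            with wave.open(str(path), "wb") as writer:
                writer.setparams(params)  # the frame count in the header is fixed up on close
                writer.writeframes(frames)
            paths.append(path)
    return paths
```

For example, `split_wav("data_svc/waves-16k/speaker1/song.wav", "data_svc/waves-16k/speaker1")` (with `song.wav` standing in for the real file name) would write `song_000.wav` and `song_001.wav`. The original long file would presumably need to be moved out of the speaker folder before rerunning `preprocess_ppg.py`. Fixed-length cuts can land in the middle of a note, so cutting at silences by hand may give cleaner training data.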