Open sergedahdouh opened 1 year ago
data_svc/waves-16k/ data_svc/whisper
speaker0<<<<<<<<<<
(639, 1024)
speaker1<<<<<<<<<<
Traceback (most recent call last):
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 54, in <module>
    pred_ppg(whisper, f"{wavPath}/{spks}/{file}.wav", f"{ppgPath}/{spks}/{file}.ppg")
  File "/content/lora-svc/prepare/preprocess_ppg.py", line 26, in pred_ppg
    ppg = whisper.encoder(mel.unsqueeze(0)).squeeze().data.cpu().float().numpy()
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/lora-svc/whisper/model.py", line 154, in forward
    assert len_x <= len_e, "incorrect audio shape"
AssertionError: incorrect audio shape
Any idea what the issue is? speaker0 is a recording of my own voice, around 11 seconds long, and speaker1 is a song, around 57 seconds long.
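Cut the audio into clips of less than 30 seconds each for Whisper. The 57-second speaker1 file is over that limit; the 11-second speaker0 file is not, which is why it went through.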
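One way to follow that advice is to split any long WAV file into shorter clips before running the preprocessing step. Whisper works on a fixed 30-second window, and the assertion in the traceback (`len_x <= len_e`) appears to be the check that the input fits inside it. The sketch below is not part of lora-svc: `split_wav`, the `max_seconds=25.0` default and the output naming scheme are all assumptions made for illustration, and it uses only the Python standard library `wave` module.

```python
import wave


def split_wav(src_path, dst_prefix, max_seconds=25.0):
    """Split a WAV file into consecutive clips of at most max_seconds each.

    Hypothetical helper, not part of lora-svc. Clips are written as
    <dst_prefix>_000.wav, <dst_prefix>_001.wav, ... and their paths returned.
    """
    paths = []
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        # Number of audio frames (samples per channel) allowed in one clip.
        frames_per_clip = int(max_seconds * frame_rate)
        index = 0
        while True:
            frames = src.readframes(frames_per_clip)
            if not frames:
                break  # reached the end of the source file
            out_path = f"{dst_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Keep the source format so the clips stay compatible.
                dst.setnchannels(channels)
                dst.setsampwidth(sample_width)
                dst.setframerate(frame_rate)
                dst.writeframes(frames)
            paths.append(out_path)
            index += 1
    return paths
```

For example, `split_wav("song.wav", "song")` on a 57-second file would write three clips of 25, 25 and 7 seconds. Where the clips need to go depends on your dataset layout; going by the traceback, the script reads `{wavPath}/{spks}/{file}.wav`, so they would presumably sit in the speaker's folder in place of the original long file. This splits at fixed offsets and can cut through a word or note, so cutting by hand at silences may give cleaner training data.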