I know the problem is that the tensors are not on the same device. I have tried changing this several times, but it still does not work.
Model was trained with pyannote.audio 0.0.1, yours is 3.0.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.0+cu118. Bad things might happen unless you revert torch to 1.x.
device cuda
onnx load done
onnx load done
Processing: 0%| | 0/18 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/sda/github/11yue/HeyGenClone/translate_fxy.py", line 35, in <module>
    translate(
  File "/mnt/sda/github/11yue/HeyGenClone/translate_fxy.py", line 12, in translate
    engine(video_filename, output_filename)
  File "/mnt/sda/github/11yue/HeyGenClone/core/engine.py", line 60, in __call__
    dereverb_out = self.dereverb.split(original_audio_file)
  File "/mnt/sda/github/11yue/HeyGenClone/core/dereverb.py", line 233, in split
    return self.pred.prediction(input)
  File "/mnt/sda/github/11yue/HeyGenClone/core/dereverb.py", line 201, in prediction
    sources = self.demix(mix.T)
  File "/mnt/sda/github/11yue/HeyGenClone/core/dereverb.py", line 126, in demix
    sources = self.demix_base(segmented_mix, margin_size=margin)
  File "/mnt/sda/github/11yue/HeyGenClone/core/dereverb.py", line 168, in demix_base
    tar_waves = model.istft(torch.tensor(spec_pred))
  File "/mnt/sda/github/11yue/HeyGenClone/core/dereverb.py", line 60, in istft
    x = torch.cat([x, freq_pad], -2)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument tensors in method wrapper_CUDA_cat)
Processing: 0%| | 0/18 [00:04<?, ?it/s]
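For reference, here is a minimal sketch of the kind of change that usually resolves this error. It is not the actual code from core/dereverb.py. The traceback shows that `torch.tensor(spec_pred)` is passed to `istft` and then concatenated with `freq_pad`. A tensor built with `torch.tensor(...)` from host data lives on the CPU by default, so the assumption here is that `freq_pad` is the tensor sitting on cuda:0. The helper name `pad_freq_bins` and the tensor shapes are hypothetical and only illustrate the idea of aligning devices before `torch.cat`.

```python
import torch


def pad_freq_bins(x: torch.Tensor, freq_pad: torch.Tensor) -> torch.Tensor:
    """Concatenate along the frequency axis (-2) after aligning devices."""
    # Move the padding to wherever x lives, so torch.cat sees a single device.
    freq_pad = freq_pad.to(device=x.device, dtype=x.dtype)
    return torch.cat([x, freq_pad], dim=-2)


# Use the GPU for the padding when one is available, as in the log above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the padding buffer kept on the model's device.
freq_pad = torch.zeros(1, 4, 2, 8, device=device)

# Hypothetical stand-in for torch.tensor(spec_pred): it starts on the CPU.
spec = torch.ones(1, 4, 6, 8)

out = pad_freq_bins(spec, freq_pad)
print(out.shape, out.device)
```

The opposite direction should work as well: moving the input to the device the model uses at the call site, for example `torch.tensor(spec_pred).to(device)`, where `device` is whatever device variable the script already defines. Which direction is appropriate depends on where the following operations expect their data.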