razvanab opened this issue 4 months ago
Hi, did you manage to fix it?
I tried, but I gave up, unfortunately.
You need to open se_extractor.py and replace line 22 with:

model = WhisperModel(model_size, device="cpu", compute_type="float32")

Then restart the kernel.
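If you'd rather not hard-code the CPU, a small helper can pick the device/compute_type pair once at startup. This is only a sketch: pick_whisper_settings is a made-up name, and the two boolean inputs are assumptions you'd have to fill in yourself (for example from torch.cuda.is_available(), or by checking whether "float16" appears in ctranslate2.get_supported_compute_types("cuda") on your install).

```python
def pick_whisper_settings(has_cuda: bool, supports_efficient_fp16: bool):
    """Return a (device, compute_type) pair for faster_whisper.WhisperModel.

    Hypothetical helper: the caller supplies the two capability flags.
    """
    if has_cuda and supports_efficient_fp16:
        # Modern GPUs (Volta and newer) handle float16 efficiently.
        return "cuda", "float16"
    if has_cuda:
        # Pascal cards like the GTX 1060 trigger the ValueError on float16;
        # float32 on the GPU may still work, and is worth trying before CPU.
        return "cuda", "float32"
    # No CUDA at all: same settings as the line-22 patch above.
    return "cpu", "float32"
```

Then line 22 becomes `model = WhisperModel(model_size, device=device, compute_type=compute_type)` with the pair unpacked from this helper.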
I get this error when I execute demo_part3.ipynb in Jupyter.
My GPU is a GTX 1060 6GB, running in a VM.
ValueError                                Traceback (most recent call last)
Cell In[3], line 2
      1 reference_speaker = 'resources/example_reference.mp3' # This is the voice you want to clone
----> 2 target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)

File ~/dev/OpenVoice/openvoice/se_extractor.py:146, in get_se(audio_path, vc_model, target_dir, vad)
    144     wavs_folder = split_audio_vad(audio_path, target_dir=target_dir, audio_name=audio_name)
    145 else:
--> 146     wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name)
    148 audio_segs = glob(f'{wavs_folder}/*.wav')
    149 if len(audio_segs) == 0:

File ~/dev/OpenVoice/openvoice/se_extractor.py:22, in split_audio_whisper(audio_path, audio_name, target_dir)
     20 global model
     21 if model is None:
---> 22     model = WhisperModel(model_size, device="cuda", compute_type="float16")
     23 audio = AudioSegment.from_file(audio_path)
     24 max_len = len(audio)

File ~/.local/lib/python3.10/site-packages/faster_whisper/transcribe.py:128, in WhisperModel.__init__(self, model_size_or_path, device, device_index, compute_type, cpu_threads, num_workers, download_root, local_files_only)
    121 else:
    122     model_path = download_model(
    123         model_size_or_path,
    124         local_files_only=local_files_only,
    125         cache_dir=download_root,
    126     )
--> 128 self.model = ctranslate2.models.Whisper(
    129     model_path,
    130     device=device,
    131     device_index=device_index,
    132     compute_type=compute_type,
    133     intra_threads=cpu_threads,
    134     inter_threads=num_workers,
    135 )
    137 tokenizer_file = os.path.join(model_path, "tokenizer.json")
    138 if os.path.isfile(tokenizer_file):

ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.
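For what it's worth, the line-22 patch above can also be written as a try/except fallback, so machines with a capable GPU still get float16 while a GTX 1060 quietly drops to CPU float32. This is just a sketch: build_model and its factory argument are hypothetical stand-ins for calling faster_whisper.WhisperModel directly.

```python
def build_model(factory, model_size="medium"):
    """Try GPU float16 first; on the ValueError above, fall back to CPU float32.

    `factory` stands in for faster_whisper.WhisperModel so the pattern is
    visible without importing the library here.
    """
    try:
        return factory(model_size, device="cuda", compute_type="float16")
    except ValueError:
        # Raised by ctranslate2 when the GPU lacks efficient fp16 support
        # (e.g. Pascal cards like the GTX 1060).
        return factory(model_size, device="cpu", compute_type="float32")
```

Used in se_extractor.py it would replace the bare constructor call at line 22, e.g. `model = build_model(WhisperModel, model_size)`.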