Afnanksalal opened 5 months ago
I solved it for myself this way: split_audio_whisper() gets an additional device argument, and get_se() passes the device information from vc_model down to split_audio_whisper().

Code changes in ../openvoice/se_extractor.py:
def split_audio_whisper(audio_path, audio_name, target_dir='processed', device='cuda'):  # NEW: device='cuda'
    global model
    if model is None:
        if device == 'cuda': compute_type = 'float16'  # NEW
        if device == 'cpu': compute_type = 'float32'   # NEW
        model = WhisperModel(model_size, device=device, compute_type=compute_type)  # NEW/modified
    # ...

def get_se(audio_path, vc_model, target_dir='processed', vad=True):
    # ...
    else:
        # NEW/modified: device=device.split(':')[0]
        wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name, device=device.split(':')[0])
    # ...
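The device handling in the patch above boils down to two steps: strip the device index (e.g. 'cuda:0' becomes 'cuda') and pick a compute type to match. A minimal standalone sketch of that logic (pick_device_settings is a hypothetical helper name, not part of OpenVoice):

```python
def pick_device_settings(device):
    """Normalize a torch-style device string and choose a compute type.

    Mirrors the patch above: CUDA gets float16, everything else float32.
    """
    base = device.split(':')[0]  # drop the device index, e.g. 'cuda:0' -> 'cuda'
    compute_type = 'float16' if base == 'cuda' else 'float32'
    return base, compute_type
```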
It runs fine with the example texts on a laptop CPU; it needs about 2 minutes for the computation.
Mike
I'm trying to get this running on my laptop on CPU, and whenever I try notebook 3 and get to the 'Obtain Tone Color Embedding' step, I always get:
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
Did you guys run into this running on your CPU?
Yes, I have the same issue, and all the modifications are done.
By default OpenVoice uses CUDA for voice tone extraction, which is a pain in the ass for CPU runtimes. I can modify the source of se_extractor.py to use a CUDA/CPU switcher that checks for drivers, but again, it's a pain in the ass to modify the internal dependency code every time I start a fresh new codebase. It would be good if y'all could add that to the repo, so that when I do a fresh install I don't need to modify the src every time!
thank you!
Just add a device argument to the function so that we can choose CPU or GPU with the matching float type; my instance, for example, uses float32.
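The CUDA/CPU switcher asked for here could be sketched as a small detection helper that the library calls instead of hard-coding 'cuda'. This is an assumption about how the repo might implement it (detect_device is a hypothetical name); it falls back to CPU when torch is missing or CUDA is unusable, so it never raises:

```python
def detect_device():
    """Return 'cuda' when a CUDA-capable torch build is usable, else 'cpu'.

    Guards the torch import so the helper also works in CPU-only
    environments where torch may not be installed at all.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return 'cuda'
    except ImportError:
        pass
    return 'cpu'
```

With a helper like this, split_audio_whisper(audio_path, audio_name, device=detect_device()) would pick the right backend on a fresh install without any source edits.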