Afnanksalal opened 5 months ago
I solved it for myself this way: split_audio_whisper() gets an additional device argument, and get_se() passes the device information from vc_model down to split_audio_whisper().

Code changes in ../openvoice/se_extractor.py:
def split_audio_whisper(audio_path, audio_name, target_dir='processed', device='cuda'):  # NEW: device='cuda'
    global model
    if model is None:
        if device == 'cuda': compute_type = 'float16'  # NEW
        if device == 'cpu': compute_type = 'float32'   # NEW
        model = WhisperModel(model_size, device=device, compute_type=compute_type)  # NEW/modified
    # ...

def get_se(audio_path, vc_model, target_dir='processed', vad=True):
    # ...
    else:
        # NEW/modified: device=device.split(':')[0]
        wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name, device=device.split(':')[0])
    # ...
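The device handling in the patch above boils down to two steps: strip the device index (e.g. 'cuda:0' becomes 'cuda') and pick a compute type to match. A minimal standalone sketch of that logic (pick_device_settings is a hypothetical helper name, not part of OpenVoice):

```python
def pick_device_settings(device):
    """Normalize a torch-style device string and choose a compute type.

    Mirrors the patch above: CUDA gets float16, everything else float32.
    """
    base = device.split(':')[0]  # drop the device index, e.g. 'cuda:0' -> 'cuda'
    compute_type = 'float16' if base == 'cuda' else 'float32'
    return base, compute_type
```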
It runs fine with the example texts on a laptop CPU; it needs about 2 minutes for the computation.
Mike
I'm trying to get this running on my laptop on CPU, and whenever I try notebook 3 and get to the 'Obtain Tone Color Embedding' step, I always get:
RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version
Did you guys run into this running on your CPU?
Yes, I have the same issue, and all the modifications are done.
By default OpenVoice uses CUDA for voice tone extraction, which is a pain in the ass for CPU runtimes. I can modify the source of se_extractor.py to use a CUDA/CPU switcher that checks for drivers, but again, it's a pain in the ass to modify the internal dependency code every time I start a fresh new codebase. It would be good if y'all could add that to the repo, so that when I do a fresh install I don't need to modify the src every time!
thank you!
Just add a device argument to the function so that we can choose CPU or GPU with the matching float type; my instance, for example, uses float32.
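The CUDA/CPU switcher asked for here could be sketched as a small detection helper that the library calls instead of hard-coding 'cuda'. This is an assumption about how the repo might implement it (detect_device is a hypothetical name); it falls back to CPU when torch is missing or CUDA is unusable, so it never raises:

```python
def detect_device():
    """Return 'cuda' when a CUDA-capable torch build is usable, else 'cpu'.

    Guards the torch import so the helper also works in CPU-only
    environments where torch may not be installed at all.
    """
    try:
        import torch
        if torch.cuda.is_available():
            return 'cuda'
    except ImportError:
        pass
    return 'cpu'
```

With a helper like this, split_audio_whisper(audio_path, audio_name, device=detect_device()) would pick the right backend on a fresh install without any source edits.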