myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.
https://research.myshell.ai/open-voice
MIT License

libcudnn error. #225

Open arthurwolf opened 6 months ago

arthurwolf commented 6 months ago

I followed the provided instructions.

I turned the demo_part3 notebook into a regular Python file to test the code:


# Import necessary libraries
import os
import torch
from openvoice import se_extractor
from openvoice.api import ToneColorConverter
from melo.api import TTS

# Constants
ckpt_converter = 'checkpoints_v2/converter'
device = "cuda:0" if torch.cuda.is_available() else "cpu"
output_dir = 'outputs_v2'

# Create output directory if it does not exist
os.makedirs(output_dir, exist_ok=True)

# Initialize Tone Color Converter
tone_color_converter = ToneColorConverter(f'{ckpt_converter}/config.json', device=device)
tone_color_converter.load_ckpt(f'{ckpt_converter}/checkpoint.pth')

# Extract tone color embedding for the target speaker
reference_speaker = 'resources/example_reference.mp3'
target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)

# Texts for various languages
texts = {
    'EN_NEWEST': "Did you ever hear a folk tale about a giant turtle?",
    'EN': "Did you ever hear a folk tale about a giant turtle?",
    'ES': "El resplandor del sol acaricia las olas, pintando el cielo con una paleta deslumbrante.",
    'FR': "La lueur dorée du soleil caresse les vagues, peignant le ciel d'une palette éblouissante.",
    'ZH': "在这次vacation中,我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。",
    'JP': "彼は毎朝ジョギングをして体を健康に保っています。",
    'KR': "안녕하세요! 오늘은 날씨가 정말 좋네요.",
}

# Output path for temporary audio file
src_path = f'{output_dir}/tmp.wav'
speed = 1.0  # Speed is adjustable

print("Processing TTS...")

# Process each language and text
for language, text in texts.items():

    print(f"Processing {language}...")

    model = TTS(language=language, device=device)
    speaker_ids = model.hps.data.spk2id

    for speaker_key, speaker_id in speaker_ids.items():
        speaker_key = speaker_key.lower().replace('_', '-')

        # Load source speaker embedding
        source_se = torch.load(f'checkpoints_v2/base_speakers/ses/{speaker_key}.pth', map_location=device)

        # Generate speech and save to temporary file
        model.tts_to_file(text, speaker_id, src_path, speed=speed)
        save_path = f'{output_dir}/output_v2_{speaker_key}.wav'

        # Convert tone color
        encode_message = "@MyShell"
        tone_color_converter.convert(
            audio_src_path=src_path,
            src_se=source_se,
            tgt_se=target_se,
            output_path=save_path,
            message=encode_message)

# Print completion message
print("TTS processing complete. Check the outputs in:", output_dir)

When I run it I get:


(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-07 - 23:45:23
╰─(openvoice) ⠠⠵ python tts.py
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Loaded checkpoint 'checkpoints_v2/converter/checkpoint.pth'
missing/unexpected keys: [] []
OpenVoice version: v2
Could not load library libcudnn_cnn_infer.so.8. Error: /lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8: undefined symbol: _ZN5cudnn14cublasSaxpy_v2EP13cublasContextiPKfS3_iPfi, version libcudnn_ops_infer.so.8
Please make sure libcudnn_cnn_infer.so.8 is in your library path!
[1]    2253274 IOT instruction (core dumped)  python tts.py
(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-07 - 23:45:35
╰─(openvoice) ⠠⠵     

What am I doing wrong?

This is Ubuntu 23.04, and when I ran into this error I did:

sudo apt install libcudnn9-static-cuda-12    
sudo apt install libcudnn8 libcudnn8-dev   

But it didn't help.

I have CUDA and everything else installed; I run dozens of different CUDA/PyTorch/AI-related projects on this machine, including most of the TTS projects available on GitHub.

Any help is very welcome.

Thank you.

HarewVlad commented 6 months ago

Try to do the following:

export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
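
If that fixes it, you can persist the variable so it is set every time the env is activated. A minimal sketch using conda's per-environment variables (this assumes a conda env named openvoice, as in the prompt above, and conda >= 4.8; the CUDA_LIBS variable name is just for illustration):

# Compute the wheel-provided library directories once
CUDA_LIBS=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`

# Store LD_LIBRARY_PATH in the env itself so every activation applies it
conda env config vars set LD_LIBRARY_PATH="$CUDA_LIBS" -n openvoice

# Re-activate to pick up the change
conda activate openvoice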
arthurwolf commented 5 months ago

It worked, though I also had to install libcublas. I'll check the documentation again to see if maybe I missed a step.

(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-10 - 13:52:06
╰─(openvoice) ⠠⠵ export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-10 - 13:52:43
╰─(openvoice) ⠠⠵ python tts.py
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Loaded checkpoint 'checkpoints_v2/converter/checkpoint.pth'
missing/unexpected keys: [] []
OpenVoice version: v2
Traceback (most recent call last):
  File "/home/arthur/dev/ai/OpenVoice/tts.py", line 23, in <module>
    target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)
  File "/home/arthur/dev/ai/OpenVoice/openvoice/se_extractor.py", line 146, in get_se
    wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name)
  File "/home/arthur/dev/ai/OpenVoice/openvoice/se_extractor.py", line 28, in split_audio_whisper
    segments, info = model.transcribe(audio_path, beam_size=5, word_timestamps=True)
  File "/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/faster_whisper/transcribe.py", line 308, in transcribe
    encoder_output = self.encode(segment)
  File "/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/faster_whisper/transcribe.py", line 610, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded
(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-10 - 13:52:59
╰─(openvoice) ⠠⠵ sudo apt install libcublas11
[sudo] password for arthur: 
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libcublaslt11
The following NEW packages will be installed:
  libcublas11 libcublaslt11
0 upgraded, 2 newly installed, 0 to remove and 11 not upgraded.
Need to get 259 MB of archives.
After this operation, 670 MB of additional disk space will be used.
Do you want to continue? [Y/n] 
Get:1 http://fr.archive.ubuntu.com/ubuntu lunar/multiverse amd64 libcublaslt11 amd64 11.11.3.6~11.8.0-3 [212 MB]
Get:2 http://fr.archive.ubuntu.com/ubuntu lunar/multiverse amd64 libcublas11 amd64 11.11.3.6~11.8.0-3 [46,7 MB]
Fetched 259 MB in 7s (39,1 MB/s)                                                                                                                                                                
Selecting previously unselected package libcublaslt11:amd64.
(Reading database ... 721648 files and directories currently installed.)
Preparing to unpack .../libcublaslt11_11.11.3.6~11.8.0-3_amd64.deb ...
Unpacking libcublaslt11:amd64 (11.11.3.6~11.8.0-3) ...
Selecting previously unselected package libcublas11:amd64.
Preparing to unpack .../libcublas11_11.11.3.6~11.8.0-3_amd64.deb ...
Unpacking libcublas11:amd64 (11.11.3.6~11.8.0-3) ...
Setting up libcublaslt11:amd64 (11.11.3.6~11.8.0-3) ...
Setting up libcublas11:amd64 (11.11.3.6~11.8.0-3) ...
Processing triggers for libc-bin (2.37-0ubuntu2.2) ...
(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-10 - 13:53:51
╰─(openvoice) ⠠⠵ python tts.py
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Loaded checkpoint 'checkpoints_v2/converter/checkpoint.pth'
missing/unexpected keys: [] []
OpenVoice version: v2
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/functional.py:665: UserWarning: stft with return_complex=False is deprecated. In a future pytorch release, stft will return complex tensors for all inputs, and return_complex=False will raise an error.
Note: you can still call torch.view_as_real on the complex output to recover the old return format. (Triggered internally at ../aten/src/ATen/native/SpectralOps.cpp:873.)
  return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
Processing TTS...
Processing EN_NEWEST...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.41k/3.41k [00:00<00:00, 1.64MB/s]
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:01<00:00, 111MB/s]
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
  0%|                                                                                                                                                                      | 0/1 [00:00<?, ?it/s]
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv1d(input, weight, bias, self.stride,
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00,  6.09s/it]
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
Processing EN...
/home/arthur/.anaconda3/envs/openvoice/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.28it/s]
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  6.72it/s]
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 10.68it/s]
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.00it/s]
 > Text split to sentences.
Did you ever hear a folk tale about a giant turtle?
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  7.32it/s]
Processing ES...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.43k/3.43k [00:00<00:00, 4.08MB/s]
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:02<00:00, 102MB/s]
 > Text split to sentences.
El resplandor del sol acaricia las olas, pintando el cielo con una paleta deslumbrante.
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.17s/it]
Processing FR...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.40k/3.40k [00:00<00:00, 3.82MB/s]
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:01<00:00, 108MB/s]
 > Text split to sentences.
La lueur dorée du soleil caresse les vagues, peignant le ciel d'une palette éblouissante.
 > ===========================
Downloading pytorch_model.bin: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 445M/445M [00:03<00:00, 112MB/s]
Some weights of the model checkpoint at dbmdz/bert-base-french-europeana-cased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:06<00:00,  6.34s/it]
Processing ZH...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.30k/2.30k [00:00<00:00, 5.02MB/s]
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:02<00:00, 102MB/s]
 > Text split to sentences.
在这次vacation中,
我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景.
 > ===========================
  0%|                                                                                                                                                                      | 0/2 [00:00<?, ?it/s]
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.477 seconds.
Prefix dict has been built successfully.
Downloading pytorch_model.bin: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 672M/672M [00:06<00:00, 108MB/s]
Some weights of the model checkpoint at bert-base-multilingual-uncased were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:10<00:00,  5.01s/it]
Processing JP...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.43k/3.43k [00:00<00:00, 5.00MB/s]
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:02<00:00, 102MB/s]
 > Text split to sentences.
彼は毎朝ジョギングをして体を健康に保っています.
 > ===========================
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  1.84it/s]
Processing KR...
Downloading config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3.40k/3.40k [00:00<00:00, 1.33MB/s]
Downloading checkpoint.pth: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 208M/208M [00:01<00:00, 113MB/s]
 > Text split to sentences.
안녕하세요! 오늘은 날씨가 정말 좋네요.
 > ===========================
  0%|                                                                                                                                                                      | 0/1 [00:00<?, ?it/s]
you have to install python-mecab-ko. install it...
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Collecting python-mecab-ko
  Downloading python_mecab_ko-1.3.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting python-mecab-ko-dic (from python-mecab-ko)
  Downloading python_mecab_ko_dic-2.1.1.post2-py3-none-any.whl.metadata (1.4 kB)
Downloading python_mecab_ko-1.3.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (578 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 578.4/578.4 kB 9.6 MB/s eta 0:00:00
Downloading python_mecab_ko_dic-2.1.1.post2-py3-none-any.whl (34.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.5/34.5 MB 9.9 MB/s eta 0:00:00
Installing collected packages: python-mecab-ko-dic, python-mecab-ko
Successfully installed python-mecab-ko-1.3.5 python-mecab-ko-dic-2.1.1.post2
Downloading pytorch_model.bin: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 476M/476M [00:04<00:00, 109MB/s]
Some weights of the model checkpoint at kykim/bert-kor-base were not used when initializing BertForMaskedLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:24<00:00, 24.42s/it]
TTS processing complete. Check the outputs in: outputs_v2
(openvoice) ╭─arthur at aquarelle in ~/dev/ai/OpenVoice on main✘✘✘ 24-05-10 - 13:55:22
╰─(openvoice) ⠠⠵ ls     

Thanks a lot!

lazzarello commented 5 months ago

Updating the library path to include the Python virtual env works, but speaker embeddings depend on nvidia-cublas-cu11 and will break with version 12. The error output is:

Traceback (most recent call last):
  File "/home/lee/src/OpenVoice/example.py", line 16, in <module>
    target_se, audio_name = se_extractor.get_se(reference_speaker, tone_color_converter, vad=False)
  File "/home/lee/src/OpenVoice/openvoice/se_extractor.py", line 146, in get_se
    wavs_folder = split_audio_whisper(audio_path, target_dir=target_dir, audio_name=audio_name)
  File "/home/lee/src/OpenVoice/openvoice/se_extractor.py", line 28, in split_audio_whisper
    segments, info = model.transcribe(audio_path, beam_size=5, word_timestamps=True)
  File "/home/lee/src/OpenVoice/.venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 308, in transcribe
    encoder_output = self.encode(segment)
  File "/home/lee/src/OpenVoice/.venv/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 610, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: Library libcublas.so.11 is not found or cannot be loaded

I was lucky enough to have a copy of that shared library elsewhere and manually added it to LD_LIBRARY_PATH.
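
If you don't have a spare copy of the library, the CUDA 11 userspace libraries can also be installed into the virtual env with pip and added to the loader path. A rough sketch, assuming it's faster-whisper's CTranslate2 backend that needs libcublas.so.11 here (as the traceback suggests):

# Install the CUDA 11 cuBLAS/cuDNN wheels inside the env
pip install nvidia-cublas-cu11 nvidia-cudnn-cu11

# Point the dynamic loader at the wheel-provided .so files (same trick as above)
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`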

xiangzy999 commented 5 months ago

Updating the library path to include the Python virtual env works, but speaker embeddings depend on nvidia-cublas-cu11 and will break with version 12. [...] RuntimeError: Library libcublas.so.11 is not found or cannot be loaded

I have encountered this problem. Can you tell me how to handle it? Thank you!

vladlearns commented 5 months ago

Made a pull request that solves all of this in Docker: https://github.com/myshell-ai/OpenVoice/pull/264 Also, check https://github.com/myshell-ai/OpenVoice/issues/215#issuecomment-2153388034 for a local Windows fix.
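
For anyone rolling their own container instead, the same library-path fix can be passed at run time; a rough sketch (the image name and the in-container site-packages paths are hypothetical, adjust to your setup):

# Run with the loader path pointing at the pip-provided CUDA libs
docker run --gpus all -it \
    -e LD_LIBRARY_PATH=/opt/conda/envs/openvoice/lib/python3.9/site-packages/nvidia/cublas/lib:/opt/conda/envs/openvoice/lib/python3.9/site-packages/nvidia/cudnn/lib \
    openvoice:latest python tts.py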

LordAlex2015 commented 4 months ago

Updating the library path to include the Python virtual env works, but speaker embeddings depend on nvidia-cublas-cu11 and will break with version 12. [...] RuntimeError: Library libcublas.so.11 is not found or cannot be loaded

I have encountered this problem. Can you tell me how to handle it? Thank you!

Fixed it with sudo apt install libcublas11

I saw it in https://github.com/myshell-ai/OpenVoice/issues/225#issuecomment-2104636968
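
To confirm the loader can now see the library, a quick check (assuming a standard Ubuntu setup where apt registers it with ldconfig):

# List the cuBLAS libraries known to the dynamic linker
ldconfig -p | grep libcublas

# libcublas.so.11 should show up, e.g. (illustrative output):
# libcublas.so.11 (libc6,x86-64) => /lib/x86_64-linux-gnu/libcublas.so.11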