RVC-Project / Retrieval-based-Voice-Conversion

in preparation...
MIT License

UVR not working: same output as the input and freezes on using CPU #29

Open alcoftTAO opened 3 months ago

alcoftTAO commented 3 months ago

Hello again! I have one issue with UVR. My code is:

from pathlib import Path
from scipy.io import wavfile
from rvc.modules.uvr5.vr import AudioPreprocess
import os
import sys
import platform

currentCWD = os.getcwd()
path = sys.prefix
system = platform.system().lower()

if (system == "windows"):
    path = path + "\\Lib\\site-packages"
else:
    pythonVersion = sys.version_info
    path = path + "/lib/python" + str(pythonVersion[0]) + "." + str(pythonVersion[1]) + "/site-packages"

os.chdir(path)

os.environ["TEMP"] = currentCWD
os.environ["weight_uvr5_root"] = currentCWD + "/uvr_assets"

model_path: str = "9_HP2-UVR.pth"
audio_path: str = "audio.wav"
agg: int = 10
uvr: AudioPreprocess = AudioPreprocess(model_path, agg, False)

uvr.config.use_cuda()
uvr.model.to("cuda")

print("Model loaded!")

inst, vocals, sr, _ = uvr.process(music_file = currentCWD + "/" + audio_path)
os.chdir(currentCWD)

wavfile.write("vocals.wav", sr, vocals)
wavfile.write("inst.wav", sr, inst)

print("Done!")

The output is:

Model loaded!
  0%|                                                    | 0/19 [00:00<?, ?it/s]/home/alcoft/Projects/Tests_I4.0/LibI4/Python_AI/.env/lib/python3.12/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
100%|███████████████████████████████████████████| 19/19 [00:01<00:00, 10.89it/s]
Done!

Both audio outputs (vocal and instrumental) are the same as the input.
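One way to confirm that the outputs really match the input, rather than judging by ear, is to compare the samples numerically. The following is a minimal sketch using only the standard library; the helper names `flatten`, `max_abs_diff`, and `looks_unseparated` are hypothetical and not part of RVC or UVR, and it assumes both signals have the same sample rate, length, and scale.

```python
from typing import Iterable, Iterator


def flatten(samples: Iterable) -> Iterator[float]:
    """Yield scalar samples from a flat or nested (multi-channel) sequence."""
    for item in samples:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield float(item)


def max_abs_diff(a: Iterable, b: Iterable) -> float:
    """Return the largest absolute sample difference between two signals."""
    flat_a = list(flatten(a))
    flat_b = list(flatten(b))
    if len(flat_a) != len(flat_b):
        raise ValueError("signals have different numbers of samples")
    return max((abs(x - y) for x, y in zip(flat_a, flat_b)), default=0.0)


def looks_unseparated(original: Iterable, separated: Iterable, tol: float = 1e-6) -> bool:
    """True if the 'separated' signal is numerically the same as the original."""
    return max_abs_diff(original, separated) <= tol
```

If the arrays are NumPy arrays (which `wavfile.write` expects), they can be converted with `.tolist()` before being passed in, for example `looks_unseparated(original.tolist(), vocals.tolist())`, where `original` is assumed to hold the input audio loaded at the same sample rate.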

Running python -c "import torch; print(torch.backends.cudnn.is_available())" prints True. Also, when I try to use the CPU for inference, the code freezes here:

Model loaded!
  0%|                                                    | 0/19 [00:00<?, ?it/s]

When this happens, the code does not use my CPU at all. The code does not print any error message.
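Because the freeze produces no error message, one option for finding out where the process is stuck is Python's built-in `faulthandler` module, which can dump the stack of every thread after a timeout. This is a sketch only; the `hang_watchdog` name and the 60-second default are assumptions chosen for illustration.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def hang_watchdog(timeout_s: float = 60.0, file=None):
    """Dump all thread tracebacks every `timeout_s` seconds while the block runs."""
    target = sys.stderr if file is None else file
    # repeat=True keeps dumping at each interval until cancelled.
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        yield
    finally:
        # Always cancel, so a finished run does not print stray tracebacks.
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the call, for example `with hang_watchdog(60): uvr.process(music_file=...)`, should print the Python-level call stack if the call has not returned after a minute, which may show whether the hang is in data loading, in the model forward pass, or elsewhere.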

My CPU is not very powerful, but it should be enough for inference. My GPU is an NVIDIA RTX 3050 and my OS is Arch Linux.

I have CUDA, cuDNN, and the NVIDIA drivers installed on my OS. My Python version is 3.12.3.

The UVR model I'm using is 9_HP2-UVR.pth.

If the problem is related to the UVR model I'm using, please recommend one that works.

codecooker1 commented 3 months ago

How much RAM do you have?

I had a similar problem, and downgrading to Python 3.10.4 fixed it. It seems that one of the required libraries, faiss, has not been ported to Python 3.12 yet.
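When comparing setups like this, it can help to post the Python version together with which of the relevant packages are present. The sketch below uses only the standard library; the `environment_report` name and the default module list are assumptions, and a package being found does not guarantee that it imports or runs correctly on that Python version.

```python
import importlib.util
import platform


def environment_report(modules=("torch", "faiss", "rvc")) -> dict:
    """Collect the Python version and whether each top-level module can be found."""
    report = {
        "python": platform.python_version(),
        "platform": platform.system(),
    }
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # A broken or partially installed package can raise here.
            found = False
        report[name] = "installed" if found else "missing"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```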

alcoftTAO commented 3 months ago

How much RAM do you have?

I have 16 GB of RAM.

I had a similar problem, and downgrading to Python 3.10.4 fixed it.

Thanks for the help. I would like to downgrade to 3.10.4, but I'm using RVC and UVR as a library in another project, and the downgrade may cause errors there. I will keep this issue open for now.

Tps-F commented 2 months ago

It seems to happen even when the CPU is not used. I'll fix it.

ybwai commented 1 month ago

Same issue on Python 3.10.14 using MPS.

rsxdalv commented 1 month ago

It's a complicated issue. I reviewed the code, and it looks almost identical to the code from the original RVC project, which works just fine. It will likely affect everyone, whether on Python 3.10.14 or any other version.
