hacksider / Deep-Live-Cam

real time face swap and one-click video deepfake with only a single image
GNU Affero General Public License v3.0
34.85k stars 4.91k forks source link

libcufft.so.10: cannot open shared object file: No such file or directory #294

Open ByerRA opened 1 month ago

ByerRA commented 1 month ago

I have an NVIDIA RTX 4000 Ada with 20 GB of VRAM, and CUDA is installed and working.

I created a conda environment with Python 3.10 and executed the following...

pip install -r requirements.txt
pip uninstall onnxruntime onnxruntime-gpu
pip install onnxruntime-gpu==1.16.3

And when I execute...

python run.py --execution-provider cuda

I get the following errors...

$ python run.py --execution-provider cuda
2024-08-13 13:20:05.140879087 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: /home/rbyer/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0 2024-08-13 13:20:06.126188628 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: /home/rbyer/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0 2024-08-13 13:20:06.187616026 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: /home/rbyer/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0 2024-08-13 13:20:06.280897680 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: /home/rbyer/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0 2024-08-13 13:20:06.491423828 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} find model: /home/rbyer/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5 set det-size: (640, 640) 2024-08-13 13:20:08.047064751 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1193 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcufft.so.10: cannot open shared object file: No such file or directory

Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}} inswapper-shape: [1, 3, 128, 128]

I have the following "libcufft.so" files installed...
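A quick way to check whether the dynamic linker can actually resolve the exact soname onnxruntime is asking for is a small ctypes probe. This is just a diagnostic sketch; the `can_load` helper is for illustration and is not part of the project:

```python
import ctypes

def can_load(libname: str) -> bool:
    """Return True if the dynamic linker can resolve and load libname."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# onnxruntime-gpu 1.16.x was built against CUDA 11, so it asks for the
# CUDA 11 cuFFT soname; CUDA 12 ships libcufft.so.11 instead, which is
# why the load fails even with a working CUDA 12 install.
print(can_load("libcufft.so.10"))
```

If this prints False on a machine with CUDA installed, the toolkit on the path is not the major version onnxruntime was built against.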

voorhs commented 1 month ago

You have an incompatible CUDA version. The dependencies pin onnxruntime-gpu 1.16.3, which requires CUDA 11.8 (source).

To solve this, either downgrade your CUDA installation or run the app in a Docker container based on nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

voorhs commented 1 month ago

I managed to make it work on my Ubuntu 24.04 system with docker and nvidia-container-toolkit.

Dockerfile

FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y python3 python3-pip python3-tk ffmpeg libsm6 libxext6

WORKDIR /app

RUN python3 -m pip install --upgrade pip

COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

COPY . /app

EXPOSE 80

ENV DISPLAY=:0

CMD ["python3", "run.py", "--execution-provider", "cuda"]

.dockerignore

models
faces

Bash:

docker build -t deep-live-cam .
xhost +
docker run \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -v ./models:/app/models \
    -v ./faces:/app/faces \
    --device=/dev/video0 \
    --gpus all \
    --rm \
    deep-live-cam

The image builds in 15-20 minutes.

ByerRA commented 1 month ago

You have an incompatible CUDA version. The dependencies pin onnxruntime-gpu 1.16.3, which requires CUDA 11.8 (source).

To solve this, either downgrade your CUDA installation or run the app in a Docker container based on nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

Thanks, that's what I figured the issue was.

I normally run everything in Docker containers to avoid situations like this, but I also like to iron out any issues with a package under conda before taking the time to create a container.

ByerRA commented 1 month ago

I managed to make it work on my Ubuntu 24.04 system with docker and nvidia-container-toolkit. ...

Thanks, that's just about how I was going to go about it, and this should be included in the main repository for those of us who use Docker.

I would also like to note that it would be a good idea to document the CUDA version required by the package in the main README so that issues like this get resolved more quickly.

voorhs commented 1 month ago

Adding the container to the main code might be tricky because of tkinter. The way the GUI is "tunnelled" from the container to the host machine is host-specific. Here I used the Unix X server; I have seen that macOS requires XQuartz.

ByerRA commented 1 month ago

True, but even a "use at your own risk" Docker build file is better than nothing.

donceykong commented 1 month ago

I managed to make it work on my Ubuntu 24.04 system with docker and nvidia-container-toolkit. ...

I would also like to note that it would be a good idea to put the version of CUDA required for the package in the main document so that issues like this will be worked out quicker.

The required CUDA version is noted in the main README

cvaisnor commented 4 weeks ago

I managed to make it work on my Ubuntu 24.04 system with docker and nvidia-container-toolkit. ...

Got this working, but there are two issues.

1) Make sure to update line 17 of requirements.txt to

onnxruntime-gpu==1.16.3; sys_platform != 'darwin'

It ran without that change, but the live face swap was glitching a bit, as seen in issue #278. Downgrading to 1.16.3 fixes it.

2) The second change is to the docker run command: add a volume mount for '/root/.insightface', bound to a matching directory on the host (in my case an empty 'root/.insightface' directory next to the project). On first run the app downloads 'buffalo_l.zip' and extracts it into that folder; with the mount in place, you don't have to re-download the zip every time you start the image.

Therefore, the new run image command should be:

docker run \
    -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -v ./models:/app/models \
    -v ./faces:/app/faces \
    -v ./root/.insightface:/root/.insightface \
    --device=/dev/video0 \
    --gpus all \
    --rm \
    deep-live-cam

zzzhengqi commented 3 weeks ago

You have an incompatible CUDA version. There is onnxruntime-gpu 1.16.3 in the dependencies, which requires CUDA 11.8 (source).

To solve this, either downgrade your CUDA or run this app in a Docker container like nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

hello, do you have any experience with this kind of situation?

E:\Deeplivecam\Deep-Live-Cam>pip install onnxruntime-gpu==1.16.3
ERROR: Could not find a version that satisfies the requirement onnxruntime-gpu==1.16.3 (from versions: 1.17.0, 1.17.1, 1.18.0, 1.18.1, 1.19.0)
ERROR: No matching distribution found for onnxruntime-gpu==1.16.3

voorhs commented 3 weeks ago

hello, do you have any experience with this kind of situation? "pip install onnxruntime-gpu==1.16.3 ... ERROR: No matching distribution found for onnxruntime-gpu==1.16.3"

No, this is the first time I've seen it.