manzolo / openai-whisper-docker

This Docker image provides a convenient environment for running OpenAI Whisper, a powerful automatic speech recognition (ASR) system.

Problems initializing Cuda #3

Open mbetrifork opened 5 days ago

mbetrifork commented 5 days ago

Following the instructions to clone and build the Whisper Docker image, I get this error on my Ubuntu 24.04 laptop with an NVIDIA RTX A2000 GPU:

$ docker run --gpus all -it -v ${PWD}/models:/root/.cache/whisper -v ${PWD}/audio-files:/app openai-whisper-image whisper audio-file.mp3 --device cuda --model large-v3 --language Italian --output_dir /app --output_format txt
/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py:129: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:108.)
  return torch._C._cuda_getDeviceCount() > 0
/usr/local/lib/python3.10/dist-packages/whisper/__init__.py:150: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(fp, map_location=device)
Traceback (most recent call last):
  File "/usr/local/bin/whisper", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/whisper/transcribe.py", line 577, in cli
    model = load_model(model_name, device=device, download_root=model_dir)
  File "/usr/local/lib/python3.10/dist-packages/whisper/__init__.py", line 150, in load_model
    checkpoint = torch.load(fp, map_location=device)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1360, in load
    return _load(
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1848, in _load
    result = unpickler.load()
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1812, in persistent_load
    typed_storage = load_tensor(
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1784, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1685, in restore_location
    return default_restore_location(storage, map_location)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 601, in default_restore_location
    result = fn(storage, location)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 539, in _deserialize
    device = _validate_device(location, backend_name)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 508, in _validate_device
    raise RuntimeError(
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

I named my image docker-whisper-image; the rest of the command is taken straight from the README.md.

When running Whisper, the output above confirms that it's not using CUDA.
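A quick way to confirm what PyTorch sees inside the container is to query it directly (a minimal check, using the same image name as in the command above):

$ docker run --gpus all -it openai-whisper-image python3 -c "import torch; print(torch.__version__, torch.cuda.is_available())"

If this prints False, PyTorch cannot initialize CUDA inside the container, regardless of what nvidia-smi reports.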

nvidia-smi seems to be installed and working, though:

$ docker run --gpus all -it openai-whisper-image nvidia-smi
Wed Nov 20 13:28:50 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX 2000 Ada Gene...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   54C    P8               1W /  35W |      7MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

On the host, the output is the same:

$ nvidia-smi
Wed Nov 20 14:31:29 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX 2000 Ada Gene...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   54C    P8               2W /  35W |      7MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3557      G   /usr/bin/gnome-shell                          2MiB |
+---------------------------------------------------------------------------------------+
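For reference, the RuntimeError itself suggests a CPU fallback. While the CUDA setup is being debugged, the same run can be attempted without the GPU by dropping --gpus all and passing --device cpu (a much slower stopgap, not a fix):

$ docker run -it -v ${PWD}/models:/root/.cache/whisper -v ${PWD}/audio-files:/app openai-whisper-image whisper audio-file.mp3 --device cpu --model large-v3 --language Italian --output_dir /app --output_format txt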
manzolo commented 5 days ago

Hi @mbetrifork, please check if the GPU is accessible in Docker by running the following commands:

docker run --gpus all nvidia/cuda:12.4.0-base-ubuntu20.04 nvidia-smi
docker run --gpus all -it nvidia/cuda:12.4.0-devel-ubuntu20.04 nvcc --version

Ensure that nvidia-container-toolkit is installed and properly configured.
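If the toolkit turns out to be missing or unconfigured, the usual setup on Ubuntu looks roughly like this (a sketch, assuming NVIDIA's apt repository has already been added; not verified on 24.04):

sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

The second command registers the NVIDIA runtime in /etc/docker/daemon.json, which is what lets --gpus all expose the driver to containers.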

manzolo commented 5 days ago

My output:

docker run --gpus all nvidia/cuda:12.4.0-base-ubuntu20.04 nvidia-smi
Wed Nov 20 15:57:18 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120                Driver Version: 550.120        CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080 Ti     Off |   00000000:03:00.0  On |                  N/A |
|  0%   50C    P8             32W /  350W |      51MiB /  12288MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
 docker run --gpus all -it nvidia/cuda:12.4.0-devel-ubuntu20.04 nvcc --version

==========
== CUDA ==
==========

CUDA Version 12.4.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0