savbell / whisper-writer

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
GNU General Public License v3.0

CUDA requirements need to match project dependencies. Distribute the libraries? #33

Open santiago-afonso opened 5 months ago

santiago-afonso commented 5 months ago

I absolutely love this project, the hold_to_record functionality is excellent! Thanks for sharing! On to the issue:

Less expert users like me could probably use more detailed instructions on how to get GPU acceleration on Windows. README.md lists the requirements for GPU acceleration as "cuBLAS for CUDA 11 and cuDNN 8 for CUDA 11". There are some issues here:

  1. The instructions don't specify whether those libraries should be installed to Windows or to the Python environment. I assume they go in Windows.
  2. cuBLAS cannot be downloaded on its own; it can only be obtained from the CUDA Toolkit or the HPC SDK. I assume we should install the former, but the instructions should clarify this.
  3. The requirements.txt pins a specific version of torch (PyTorch), 2.0.1, which apparently only works with CUDA 11.8 or 12.1. The cuDNN 8 version must also match the torch version, since both ship different builds for CUDA 11.x and CUDA 12.x (see https://pytorch.org/get-started/previous-versions/).
  4. The current CUDA Toolkit version, 12.3, is the one NVIDIA offers for download by default, and it is apparently incompatible with the PyTorch version the project uses.
  5. The README.md doesn't say which specific sub-version of cuDNN 8 is required (8.x, but which x?).
  6. The current cuDNN version, 9.0.0, is the one NVIDIA offers for download by default.
  7. This is the main issue: archival cuDNN 8.x versions (https://developer.nvidia.com/rdp/cudnn-archive) are offered without a Windows installer, only as a collection of .dll files.
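Since the torch build, its CUDA runtime, and cuDNN all have to agree, one quick way to see which versions an environment actually has is to ask torch itself. A minimal diagnostic sketch (only torch is assumed to be installed; the version fields simply print as None when the corresponding piece is missing):

```python
import importlib.util

def report_cuda_stack():
    """Report the versions that have to agree: torch, its CUDA build, cuDNN."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch
    return "\n".join([
        f"torch {torch.__version__}",
        f"built for CUDA {torch.version.cuda}",     # e.g. '11.8' or '12.1'; None on CPU-only builds
        f"cuDNN {torch.backends.cudnn.version()}",  # e.g. 8700 for 8.7.0; None if unavailable
        f"CUDA available: {torch.cuda.is_available()}",
    ])

print(report_cuda_stack())
```

If "built for CUDA" and the installed Toolkit/cuDNN versions disagree, that mismatch is the first thing to fix.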

Solutions:

I solved it, but I don't know exactly how, because I applied both of these attempted fixes at the same time:

  1. I upgraded torch by running `pip3 install torch torchaudio --index-url https://download.pytorch.org/whl/cu118`
  2. I copied the CUDA DLLs distributed with faster-whisper (https://github.com/Purfview/whisper-standalone-win/releases/tag/libs) to the Windows System32 folder. Note that nobody should ever, ever, do this. So it'd be great if whisper-writer could distribute the libraries and import them directly, as suggested in https://github.com/SYSTRAN/faster-whisper/issues/153#issuecomment-1853138605

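For point 2, the "distribute and import directly" idea from the linked faster-whisper issue can be sketched with the stdlib `os.add_dll_directory` (Windows-only, Python 3.8+). The `cuda_libs` folder name here is hypothetical; the idea is that whisper-writer would ship the cuBLAS/cuDNN DLLs there instead of users copying them into System32:

```python
import os
import sys

def register_bundled_cuda_dlls(base_dir):
    """On Windows, make DLLs shipped alongside the app visible to the loader.

    `base_dir` is a hypothetical folder (e.g. <repo>/cuda_libs) into which the
    cuBLAS/cuDNN DLLs would be copied instead of System32.
    """
    if sys.platform != "win32":
        return None  # os.add_dll_directory only exists on Windows
    if not os.path.isdir(base_dir):
        raise FileNotFoundError(f"expected bundled CUDA DLLs in {base_dir}")
    # Directories registered this way are searched when the native extensions
    # (ctranslate2, torch) load their CUDA dependencies, without touching
    # System32 or the PATH environment variable.
    return os.add_dll_directory(base_dir)

# Hypothetical usage, before `import faster_whisper`:
# register_bundled_cuda_dlls(os.path.join(os.path.dirname(__file__), "cuda_libs"))
```

The registration has to happen before the first import that pulls in the native CUDA libraries.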
Enyium commented 4 months ago

I also had problems getting CUDA to run. WhisperWriter told me CUDA was not available. I had CUDA 12.1 and cuDNN 9.0 installed.

Later in the process, I noticed that the PATH contained C:\Program Files\NVIDIA\CUDNN\v9.0\bin, although the DLLs are in an additional subfolder named after the CUDA version. Someone having problems could first try changing the PATH entry to the subfolder for their CUDA version and see if that helps.
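A stdlib-only way to check whether a given DLL is actually reachable from PATH, which would catch exactly this "right directory, wrong subfolder" situation. The DLL names in the example are the usual cuBLAS/cuDNN 8 file names for CUDA 11; adjust them to the versions you're targeting:

```python
import os

def find_dll_on_path(dll_name, path_var=None):
    """Return the directories on PATH that contain the given DLL, in search order."""
    if path_var is None:
        path_var = os.environ.get("PATH", "")
    hits = []
    for d in path_var.split(os.pathsep):
        if d and os.path.isfile(os.path.join(d, dll_name)):
            hits.append(d)
    return hits

# Example: can the loader resolve cuBLAS for CUDA 11 and cuDNN 8?
for dll in ("cublas64_11.dll", "cudnn_ops_infer64_8.dll"):
    print(dll, "->", find_dll_on_path(dll) or "NOT FOUND on PATH")
```

An empty result for a DLL that faster-whisper needs means PATH points at the wrong folder (or the library isn't installed at all).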

Anyways, I did the following and now WhisperWriter works for me:

EDIT: After restarting, WhisperWriter complained that a DLL belonging to CUDA v12 was missing. I then added the two v12.3 paths I still had installed to my PATH environment variable, in addition to the v11.8 paths, and it worked again (i.e., C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin and C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp).
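With several CUDA versions installed side by side, the loader simply takes whichever PATH entry matches first, so it can help to list the CUDA-related entries in their actual search order. A small sketch:

```python
import os

def cuda_dirs_on_path(path_var=None):
    """Return PATH entries that look CUDA/cuDNN-related, in loader search order."""
    if path_var is None:
        path_var = os.environ.get("PATH", "")
    keywords = ("cuda", "cudnn")
    return [d for d in path_var.split(os.pathsep)
            if any(k in d.lower() for k in keywords)]

for i, d in enumerate(cuda_dirs_on_path()):
    print(i, d)
```

If a v11.8 bin directory is listed before the v12.3 one (or vice versa), that ordering decides which DLLs actually get loaded.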

santiago-afonso commented 4 months ago

I moved the repository to a different drive and, for some reason, all CUDA dependencies broke. I got it running again, but it repeats the transcribed text several times in the terminal and only pastes part of it at the cursor position. I also hit different dependency issues than last time.