neonbjb / tortoise-tts

A multi-voice TTS system trained with an emphasis on quality
Apache License 2.0

User provided device_type of 'cuda', but CUDA is not available. Disabling #648

Open MagicChakram opened 8 months ago

MagicChakram commented 8 months ago

I have a system with an RTX 3060M on Win10 (fresh install: only git, Miniconda, CUDA, the NVIDIA driver, and tortoise-tts). The NVIDIA driver is installed, and CUDA is 12.3 (I also tried 11.7 with the same result). When typing in the terminal:

(tortoise) C:\Users\user\Desktop\tortoise-tts>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:56:38_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0

but when doing a test TTS run, after the model files download:

Generating autoregressive samples.. C:\ProgramData\miniconda3\envs\tortoise\lib\site-packages\torch\amp\autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling

and then it uses the CPU to generate samples. I need it done on the GPU. Please help.

fkw11 commented 8 months ago

I'm getting the same issue, plus another one. I followed the instructions exactly. On Linux Mint, with a 3070 Ti.

python tortoise/do_tts.py --text "I'm going to speak this" --voice random --preset fast
/home/user/miniconda3/envs/tortoise/lib/python3.9/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Generating autoregressive samples..
/home/user/miniconda3/envs/tortoise/lib/python3.9/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn(

tmceld commented 8 months ago

I wouldn't normally +1 this, but I saw that a similar issue was recently closed, so I feel I should add my voice: I am also having this problem, on a pretty fresh install of Pop!_OS (Ubuntu-based).

manmay-nakhashi commented 8 months ago

Check if your installed torch has CUDA support. Normally I check it with:

import torch
torch.cuda.is_available()

If this returns True, CUDA is available; if not, install torch from conda, following the instructions on the pytorch.org website.
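A slightly fuller check (a sketch, assuming a standard PyTorch install) also shows whether the installed wheel is a CPU-only build, which would explain the warning:

import torch

print(torch.__version__)          # a "+cpu" suffix usually marks a CPU-only wheel
print(torch.version.cuda)         # None on CPU-only builds, e.g. "11.8" on CUDA builds
print(torch.cuda.is_available())  # True only with a CUDA build and a working driver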

tmceld commented 8 months ago

I did that; it's not available.

I just noticed on the readme:

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu

Should we really be installing the nightly CPU version?
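For reference, the CUDA-enabled counterpart of that command would presumably point at a CUDA wheel index instead (cu118 is an assumption here; match it to your installed toolkit):

pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118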

heldenby commented 8 months ago

I have exactly the same error on Windows 10 with a GTX 980 Ti. I've had this working in the past on a full Anaconda install, but recently did a fresh install with Miniconda.

heldenby commented 8 months ago

FWIW, I decided to uninstall Miniconda and reinstall full Anaconda. I repeated the tortoise-tts install exactly the same way, and it's now using CUDA as before.

MagicChakram commented 8 months ago

Check if your installed torch has CUDA support. Normally I check it with:

import torch
torch.cuda.is_available()

If this returns True, CUDA is available; if not, install torch from conda, following the instructions on the pytorch.org website.

The thing is, a couple of weeks ago I did everything according to the listed TTS manual and it all worked, on both Ubuntu and Windows systems (I have a spare SSD to experiment with and reinstall if needed) and on fresh installs; given that CUDA and the drivers were installed, everything worked. Now, doing the same, it throws the CUDA error. So I reckon it is something to do with a recent update.

Thanks @heldenby, I will give installing full Anaconda instead of Miniconda a try, and will also try installing an older release of the TTS. Will edit once I have a result.

MagicChakram commented 8 months ago

Unfortunately, uninstalling Miniconda and installing full Anaconda brings the same result. I also tried git reset --hard f7ed535 and installed the older version (to see if later commits broke something), and CUDA is still not recognised. When running the latest repo version and trying to do a fast read, it gave a bit more detail:

(tortoise) C:\Users\user\Desktop\tortoise-tts>python tortoise/read_fast.py --textfile 1.txt --voice random
C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Downloading hifidecoder.pth: 100%|████████████████████████████████████████████████| 71.5M/71.5M [00:04<00:00, 17.1MB/s]
C:\Users\user\.conda\envs\tortoise\lib\site-packages\huggingface_hub\file_download.py:137: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\user\.cache\tortoise\models. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
  warnings.warn(message)
Traceback (most recent call last):
  File "C:\Users\user\Desktop\tortoise-tts\tortoise\read_fast.py", line 33, in <module>
    tts = TextToSpeech(models_dir=args.model_dir, use_deepspeed=args.use_deepspeed, kv_cache=args.kv_cache, half=args.half)
  File "C:\Users\user\Desktop\tortoise-tts\tortoise\api_fast.py", line 222, in __init__
    hifi_model = torch.load(get_model_path('hifidecoder.pth'))
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 1014, in load
    return _load(opened_zipfile,
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 1422, in _load
    result = unpickler.load()
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 1392, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 1366, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 381, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 274, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\user\.conda\envs\tortoise\lib\site-packages\torch\serialization.py", line 258, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
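For reference, the CPU fallback that the error message itself suggests would look roughly like this (it only confirms the checkpoint loads; it does not restore GPU inference, and the path here is illustrative):

import torch

# Map CUDA-saved tensors onto the CPU instead of failing; this is the
# workaround quoted in the error message, not a fix for missing CUDA support.
state = torch.load('hifidecoder.pth', map_location=torch.device('cpu'))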

I will try to wipe the SSD, reinstall Windows, CUDA, and the driver, and go with full Anaconda, to see if it helps...

JuliusH93 commented 8 months ago

I had exactly the same issue and struggled with it for some hours. The solution for me was to do a clean install of torch as described here: https://github.com/pytorch/pytorch/issues/30664#issuecomment-1707395814

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

fkw11 commented 8 months ago

I had exactly the same issue and struggled with it for some hours. The solution for me was to do a clean install of torch as described here: pytorch/pytorch#30664 (comment)

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

This did not resolve the issue.

MagicChakram commented 8 months ago

Tried the command from https://pytorch.org/get-started/locally/:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

It did not help. Also tried to git reset --hard the repo to commit 5bbb0e0. It did not help either.

But what helped is this:

Meanwhile, doing it as written in the repo returns the error (I reproduced it on many fresh installs).

If using CUDA 11.7, what helped me was to reset the repo to the commit 5bbb0e0 mentioned above and then execute the Python install script. You will get DeepSpeed and other errors that are normal on Windows (I had these in the past; I think the dependencies were a mess back then, or something was running out of RAM and not getting installed - I don't know), and this comment helps to solve them. But if you do this, you will miss the nice new faster-inference read feature.
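The sequence described would look roughly like this (the commit hash is the one quoted above; python setup.py install is an assumption for the "Python install script" referred to):

git reset --hard 5bbb0e0
python setup.py install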

I want to say a big thanks to all the devs and contributors for the work being done, and to all who tried to help here. One more thing I noticed: the RTX 3060 uses 256 samples when reading with both the standard and high_quality presets, but I guess I will open a separate issue on that one (tried with a 3080 and all was OK).

RichardTang75 commented 8 months ago

I had the same problem using the Docker image, but using an older version of PyTorch helped me:

conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia

barugamon commented 8 months ago

Recreating the conda environment, then installing pytorch and transformers with pip instead of conda solved this issue for me:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install transformers==4.29.2

edward93 commented 7 months ago

I had exactly the same issue and struggled with it for some hours. The solution for me was to do a clean install of torch as described here: pytorch/pytorch#30664 (comment)

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

This worked for me!

peter064226 commented 7 months ago

I had the same problem using the Docker image, but using an older version of PyTorch helped me:

conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia

Same issue; this worked for me. You saved my life, thank you.

jaffster595 commented 4 months ago

I had exactly the same issue and struggled with it for some hours. The solution for me was to do a clean install of torch as described here: pytorch/pytorch#30664 (comment)

pip uninstall torch
pip cache purge
pip install torch -f https://download.pytorch.org/whl/torch_stable.html

Confirmed that this worked for me. I actually had more issues when using Miniconda, which I eventually gave up on. Installing with pip got me up to the point of the error in this thread, and then your suggestion worked.

csaben commented 3 months ago

Recreating the conda environment, then installing pytorch and transformers with pip instead of conda solved this issue for me:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install transformers==4.29.2

Your suggestions worked! Here is a gist of everything I used to get it working.

traysonkelii commented 3 months ago

Recreating the conda environment, then installing pytorch and transformers with pip instead of conda solved this issue for me:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install transformers==4.29.2

This also solved it for me (on Windows 11). Thank you.

JeremyBickel commented 3 months ago

The torch versions are mixed up in the Dockerfile. According to ChatGPT (no guarantees - I did it manually and didn't test this file), the new Dockerfile should be:

FROM nvidia/cuda:12.2.0-base-ubuntu22.04

COPY . /app

RUN apt-get update && \
    apt-get install -y --allow-unauthenticated --no-install-recommends \
    wget \
    git \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/*

ENV HOME "/root"
ENV CONDA_DIR "${HOME}/miniconda"
ENV PATH="$CONDA_DIR/bin":$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PIP_DOWNLOAD_CACHE="$HOME/.pip/cache"
ENV TORTOISE_MODELS_DIR="$HOME/tortoise-tts/build/lib/tortoise/models"

RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda3.sh \
    && bash /tmp/miniconda3.sh -b -p "${CONDA_DIR}" -f -u \
    && "${CONDA_DIR}/bin/conda" init bash \
    && rm -f /tmp/miniconda3.sh \
    && echo ". '${CONDA_DIR}/etc/profile.d/conda.sh'" >> "${HOME}/.profile"

# --login option used to source bashrc (thus activating conda env) at every RUN statement
SHELL ["/bin/bash", "--login", "-c"]

RUN conda create --name tortoise python=3.9 numba inflect \
    && conda activate tortoise \
    && conda install pytorch=2.2.1=py3.9_cuda12.1_cudnn8.9.2_0 pytorch-cuda=12.1 torchvision torchaudio -c pytorch -c nvidia \
    && conda install transformers=4.31.0 \
    && cd /app \
    && python setup.py install

0xMatthew commented 2 months ago

Recreating the conda environment, then installing pytorch and transformers with pip instead of conda solved this issue for me:

pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install transformers==4.29.2

A couple of these suggestions worked for me, but this one was the fastest. Thanks for posting. :)

moluuser commented 2 months ago

@0xMatthew Can you share your CUDA version? It's not working for me. Driver Version: 535.146.02, CUDA Version: 12.2.

moluuser commented 2 months ago

Strangely, the following code runs and prints everything normally, but running the project still gives "CUDA is not available. Disabling". I tried all the methods above.

import torch
use_cuda = torch.cuda.is_available()

if not use_cuda:
    exit()

print(torch.cuda.get_device_name(0))
print('__CUDNN VERSION:', torch.backends.cudnn.version())
print('__Number CUDA Devices:', torch.cuda.device_count())
print('__CUDA Device Name:',torch.cuda.get_device_name(0))
print('__CUDA Device Total Memory [GB]:',torch.cuda.get_device_properties(0).total_memory/1e9)
print('Memory Usage:')
print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')
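One possible explanation when a standalone script sees CUDA but the project does not (a guess, not a confirmed diagnosis) is an environment mismatch: the project may be launched with a different Python that has a CPU-only torch. A quick comparison, run from inside the failing process:

import sys
import torch

print(sys.executable)      # which Python interpreter is actually running
print(torch.__file__)      # which torch installation was imported
print(torch.version.cuda)  # None here would indicate a CPU-only build
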

0xMatthew commented 2 months ago

@0xMatthew Can you share your CUDA version? It's not working for me. Driver Version: 535.146.02, CUDA Version: 12.2.

From nvidia-smi: Driver Version: 552.22, CUDA Version: 12.4.

blackcatstudiosdevelopment commented 2 months ago

The torch versions are mixed up in the Dockerfile. According to ChatGPT (no guarantees - I did it manually and didn't test this file), the new Dockerfile should be: [Dockerfile quoted above]


This should add the --yes flag to the conda commands, or it pauses for way too long:

FROM nvidia/cuda:12.2.0-base-ubuntu22.04

COPY . /app

RUN apt-get update && \
    apt-get install -y --allow-unauthenticated --no-install-recommends \
    wget \
    git \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/*

ENV HOME "/root"
ENV CONDA_DIR "${HOME}/miniconda"
ENV PATH="$CONDA_DIR/bin":$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false
ENV PIP_DOWNLOAD_CACHE="$HOME/.pip/cache"
ENV TORTOISE_MODELS_DIR="$HOME/tortoise-tts/build/lib/tortoise/models"

RUN wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O /tmp/miniconda3.sh \
    && bash /tmp/miniconda3.sh -b -p "${CONDA_DIR}" -f -u \
    && "${CONDA_DIR}/bin/conda" init bash \
    && rm -f /tmp/miniconda3.sh \
    && echo ". '${CONDA_DIR}/etc/profile.d/conda.sh'" >> "${HOME}/.profile"

# --login option used to source bashrc (thus activating conda env) at every RUN statement
SHELL ["/bin/bash", "--login", "-c"]

RUN conda create --name tortoise python=3.9 numba inflect \
    && conda activate tortoise \
    && conda install pytorch=2.2.1=py3.9_cuda12.1_cudnn8.9.2_0 pytorch-cuda=12.1 torchvision torchaudio -c pytorch -c nvidia --yes \
    && conda install transformers=4.31.0 --yes \
    && cd /app \
    && python setup.py install
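
For completeness, a typical way to build and run the resulting image with GPU access (assuming the NVIDIA Container Toolkit is set up on the host; the tag name is illustrative):

docker build . -t tortoise-tts
docker run --gpus all --rm -it tortoise-tts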