oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

The GPU is not recognized, it's not possible to load any LLM, and voice cloning is buggy even with just 5 lines of text. Is there any relation between these issues? #6388

Open AntonielHUB opened 2 weeks ago

AntonielHUB commented 2 weeks ago

Describe the bug

In short, a summary of my problems: I am using the most up-to-date version of Ubuntu on a completely clean installation, done specifically to test the interface and run some LLMs. But essentially I can't use anything. Below I list my setup as well as the errors encountered.

Is there an existing issue for this?

Reproduction

I ran start_linux.sh and selected my GPU vendor, which in my case is AMD. After all the packages were downloaded, I opened http://127.0.0.1:7860/. Once the interface was up, I went to the Session tab, selected coqui_tts, applied, and restarted. The interface could not activate coqui_tts, so I ran update_wizard_linux.sh, installed coqui_tts, and also updated the interface and the other extensions to see if the error would be resolved. Then I repeated the previous step: I started the interface, went to Session, selected coqui_tts, applied, and restarted. After that I was able to activate coqui_tts, which is the feature I want to use. It works, but far worse than in any video you see on YouTube or than what the library promises. On top of that, I also downloaded LLMs meant to run exclusively on the GPU, but every model I download fails to load and shows errors.
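(Editorial aside, not in the original report: assuming the standard one-click installer layout, the extension can also be enabled at launch rather than through the Session tab, e.g. by adding --extensions coqui_tts to CMD_FLAGS.txt or passing it to start_linux.sh, which removes the restart step from the reproduction.)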

Screenshot

No response

Logs

*******************************************************************
* You haven't downloaded any model yet.
* Once the web UI launches, head over to the "Model" tab and download one.
*******************************************************************
/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/awq/modules/linear/gemm.py:14: UserWarning: AutoAWQ could not load GEMM kernels extension. Details: No module named 'awq_ext'
  warnings.warn(f"AutoAWQ could not load GEMM kernels extension. Details: {ex}")
/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/awq/modules/linear/gemv.py:11: UserWarning: AutoAWQ could not load GEMV kernels extension. Details: No module named 'awq_ext'
  warnings.warn(f"AutoAWQ could not load GEMV kernels extension. Details: {ex}")
/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/awq/modules/linear/gemv_fast.py:10: UserWarning: AutoAWQ could not load GEMVFast kernels extension. Details: No module named 'awq_v2_ext'
  warnings.warn(f"AutoAWQ could not load GEMVFast kernels extension. Details: {ex}")
08:12:14-911245 INFO Starting Text generation web UI
08:12:14-915174 INFO Loading settings from "settings.yaml"
08:12:14-917418 INFO Loading the extension "coqui_tts"
[XTTS] Loading XTTS...
> tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
> Using model: xtts
/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/generation/configuration_utils.py:577: UserWarning: `do_sample` is set to `False`. However, `min_p` is set to `0.0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `min_p`.
  warnings.warn(
[XTTS] Done!
Running on local URL: http://127.0.0.1:7860
08:12:58-479097 INFO Saved "/home/user/Downloads/text-generation-webui-main/settings.yaml".
Closing server running on port: 7860
08:13:00-983167 INFO Loading the extension "coqui_tts"
Running on local URL: http://127.0.0.1:7860
Downloading the model to models/TheBloke_dolphin-2.7-mixtral-8x7b-AWQ
README.md: 100%
eval_results.json: 100%
generation_config.json: 100%
config.json: 100%
added_tokens.json: 100%
model.safetensors.index.json: 100%
quant_config.json: 100%
special_tokens_map.json: 100%
tokenizer.json: 100%
tokenizer.model: 100%
tokenizer_config.json: 100%
model-00003-of-00003.safetensors: 100%
model-00001-of-00003.safetensors: 100%
model-00002-of-00003.safetensors: 100%
08:35:23-469744 INFO Loading "TheBloke_dolphin-2.7-mixtral-8x7b-AWQ"
08:35:23-471485 INFO TRANSFORMERS_PARAMS=
{'low_cpu_mem_usage': True, 'torch_dtype': torch.float16}
08:35:23-545591 ERROR Failed to load the model.
Traceback (most recent call last):
  File "/home/user/Downloads/text-generation-webui-main/modules/ui_model_menu.py", line 231, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "/home/user/Downloads/text-generation-webui-main/modules/models.py", line 93, in load_model
    output = load_func_map[loader](model_name)
  File "/home/user/Downloads/text-generation-webui-main/modules/models.py", line 172, in huggingface_loader
    model = LoaderClass.from_pretrained(path_to_model, **params)
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3388, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/quantizers/auto.py", line 161, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/quantizers/auto.py", line 91, in from_dict
    return target_cls.from_dict(quantization_config_dict)
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/utils/quantization_config.py", line 97, in from_dict
    config = cls(**config_dict)
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/utils/quantization_config.py", line 814, in __init__
    self.post_init()
  File "/home/user/Downloads/text-generation-webui-main/installer_files/env/lib/python3.11/site-packages/transformers/utils/quantization_config.py", line 821, in post_init
    raise ValueError("AWQ is only available on GPU")
ValueError: AWQ is only available on GPU.
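(Editorial note, not part of the original report: as far as I can tell, transformers raises this ValueError whenever torch.cuda.is_available() returns False, so the quickest way to see what PyTorch actually detects is a check like the sketch below. Note that the ROCm build of PyTorch for AMD cards also answers through the torch.cuda API.)

    # Diagnostic sketch (editorial): shows whether PyTorch sees a GPU at all.
    import torch

    print(torch.__version__)          # a ROCm build looks like "2.x.y+rocmN.N"
    print(torch.cuda.is_available())  # False here is why AWQ loading fails
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))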

System Info

CPU: i9-11900H ES (engineering sample)
GPU: RX 6600m
RAM: 16GB

Smarandii commented 4 days ago

I have the same problem

Smarandii commented 4 days ago

@AntonielHUB I've managed to solve the issue on my end:

  1. Go to the text-generation-webui folder.

  2. Run cmd_windows.bat, cmd_linux.sh, or cmd_macos.sh, depending on your operating system.

  3. Check the CUDA installation: ensure that CUDA is properly installed on your system. You can verify this by running the following command in your terminal:

    nvcc --version

    This will show you the installed version of CUDA. Make sure it's a version supported by PyTorch (e.g., CUDA 12.1, 11.8, or 11.7).

  4. Uninstall the current PyTorch installation: if you're using a version of PyTorch that doesn't match your CUDA version, you may encounter issues. Uninstall the current version of PyTorch:

    pip uninstall torch torchvision torchaudio

  5. Reinstall PyTorch with CUDA support: based on your CUDA version, reinstall PyTorch using the corresponding index URL. For instance, if you have CUDA 12.1 installed, use the following command:

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

  6. Check that it worked by starting a Python shell:

    python
    >>> import torch
    >>> print(torch.cuda.is_available())  # should print True if the GPU is detected

  7. Restart text-generation-webui and try to load any AWQ model; this should work now. (For an end-to-end check outside the web UI, see the sketch after this list.)
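(Editorial follow-up to step 7: a minimal, untested sketch of loading the same AWQ checkpoint from the log directly through transformers, to confirm the fix outside the web UI. The hub id is inferred from the download folder name in the log, and it assumes the autoawq and accelerate packages are installed in the environment.)

    # Hedged sketch: verifies that an AWQ checkpoint loads once
    # torch.cuda.is_available() returns True. Hub id inferred from the
    # "TheBloke_dolphin-2.7-mixtral-8x7b-AWQ" folder in the log above.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TheBloke/dolphin-2.7-mixtral-8x7b-AWQ"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",  # keep the checkpoint's fp16 weights
        device_map="auto",   # place the quantized weights on the detected GPU
    )
    print(model.device)  # should report a cuda device, not cpu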