Can't load Llama 3 safetensors model

Bedoshady commented 6 months ago

Describe the bug

I tried to load the Llama 3 8B safetensors model, but it doesn't work.

Is there an existing issue for this?

[X] I have searched the existing issues

Reproduction

Download the .pth file of the Llama 3 8B model from the Meta website, use the convert_llama_weights_to_hf script provided by Hugging Face to turn it into a safetensors model, then try to load it using the text generation web UI.

Screenshot

No response

Logs

13:59:35-790143 INFO     Starting Text generation web UI

Running on local URL:  http://127.0.0.1:7860

14:00:04-293944 INFO     Loading "Meta-Llama-3-8B"
14:00:04-299943 INFO     TRANSFORMERS_PARAMS=
{'low_cpu_mem_usage': True, 'torch_dtype': torch.float16}

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████| 4/4 [07:17<00:00, 109.35s/it]
14:07:33-602648 ERROR    Failed to load the model.
Traceback (most recent call last):
  File "H:\Downloads\text-generation-webui-main\modules\ui_model_menu.py", line 247, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "H:\Downloads\text-generation-webui-main\modules\models.py", line 94, in load_model
    output = load_func_map[loader](model_name)
  File "H:\Downloads\text-generation-webui-main\modules\models.py", line 178, in huggingface_loader
    model = model.cuda()
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\transformers\modeling_utils.py", line 2664, in cuda
    return super().cuda(*args, **kwargs)
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\nn\modules\module.py", line 915, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\nn\modules\module.py", line 779, in _apply
    module._apply(fn)
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\nn\modules\module.py", line 915, in <lambda>
    return self._apply(lambda t: t.cuda(device))
  File "H:\tools\MiniConda2\envs\tg\lib\site-packages\torch\cuda\__init__.py", line 284, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

System Info

Windows 10
Miniconda
nvidia GPU

nickpotafiy commented 6 months ago

This isn't a bug. You don't have torch + cuda installed.

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

Also install the latest CUDA toolkit:

https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_local
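The "Torch not compiled with CUDA enabled" assertion in the traceback means the installed PyTorch is a CPU-only build. A minimal sketch for checking this from inside the same conda environment follows. The helper name describe_torch_build is made up for illustration; the torch attributes it reads (torch.__version__, torch.version.cuda, torch.cuda.is_available()) are standard.

```python
import importlib.util


def describe_torch_build():
    """Return basic facts about the PyTorch build in the current environment."""
    # Hypothetical helper: report "not installed" instead of raising ImportError.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        # pip wheels often carry a suffix such as "+cpu" or "+cu121"
        "version": torch.__version__,
        # CUDA version the build was compiled against; None for a CPU-only build
        "built_with_cuda": torch.version.cuda,
        # True only if the build has CUDA support and a usable GPU/driver is found
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If built_with_cuda is None, reinstalling the toolkit alone will not help; the torch package itself has to be replaced with a CUDA build.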

Also, you don't need to download the .pth file and convert it to safetensors. You can grab the safetensors from:

https://huggingface.co/meta-llama/Meta-Llama-3-8B/tree/main
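One possible way to fetch those files from a script is the huggingface_hub package. This is a sketch under assumptions: the function name and the file patterns are illustrative, the repository is gated (so an access token for an account that has accepted the license is likely required), and the token argument assumes a reasonably recent huggingface_hub release.

```python
# Illustrative patterns: keep the weights and JSON configs, skip the original/ .pth folder.
ALLOW_PATTERNS = ["*.safetensors", "*.json"]
IGNORE_PATTERNS = ["original/*"]


def download_llama3_safetensors(local_dir, repo_id="meta-llama/Meta-Llama-3-8B", token=None):
    """Download the safetensors checkpoint into local_dir and return its path."""
    # Imported lazily so this file can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        allow_patterns=ALLOW_PATTERNS,
        ignore_patterns=IGNORE_PATTERNS,
        token=token,  # assumption: a token with access to the gated repo
    )
```

The target directory would typically be a subfolder of the web UI's models folder, for example download_llama3_safetensors("models/Meta-Llama-3-8B", token="hf_...").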

Bedoshady commented 6 months ago

I downloaded the CUDA toolkit from https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_local

and I had already installed torch with this command, conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia, instead of the one you provided, but it still gives the same error.

I installed CUDA and Miniconda to custom locations. Is this the problem? Do I need to restart the computer after installing CUDA, or is the error something else? Thanks in advance for the help.

Bedoshady commented 6 months ago

These are the automatic choices for loading the model. Is this correct?

nickpotafiy commented 6 months ago

Your screenshot shows your GPUs aren't detected. Grab a clean copy of text-generation-webui and run start_windows.bat. Go through the installation process and follow the instructions. It should install everything you need, including torch+cuda. So long as you have the Nvidia toolkit installed properly, this should work.

Keep in mind, the latest webui uses torch 2.2.1+cu121, so your Nvidia toolkit should be at least 12.1. Get rid of any other versions you have installed, and make sure your environment variables CUDA_HOME and CUDA_PATH point to your installed location, like C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1
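A quick way to see what those two variables currently point to is sketched below. The helper name check_cuda_env is made up for illustration, and the script only tests that each path is an existing directory, not that it holds a working toolkit of the right version.

```python
import os
from pathlib import Path


def check_cuda_env(names=("CUDA_HOME", "CUDA_PATH"), env=None):
    """Map each variable name to a (value, directory_exists) pair."""
    env = os.environ if env is None else env
    report = {}
    for name in names:
        value = env.get(name)  # None when the variable is not set
        exists = bool(value) and Path(value).is_dir()
        report[name] = (value, exists)
    return report


if __name__ == "__main__":
    for name, (value, exists) in check_cuda_env().items():
        if value is None:
            print(f"{name} is not set")
        elif exists:
            print(f"{name} = {value}")
        else:
            print(f"{name} = {value} (directory not found)")
```

Run it from the same terminal that launches the web UI, since variables changed in the Windows settings dialog only show up in newly opened terminals.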

Bedoshady commented 6 months ago

I am getting this error:

Retrieving notices: ...working... OPENSSL_Uplink(00007FFFB6BABD50,08): no OPENSSL_Applink

Bedoshady commented 6 months ago

When I used the converted safetensors model with the web UI, it didn't work. I tried to convert it to GGUF, but that didn't work either. Then I discovered that, for some reason, when converting the .pth file to HF format, the safetensors files were correct while the other files weren't. So I downloaded those other files from Hugging Face and tried again with the web UI, and again it didn't work. However, I then converted the model to GGUF successfully and can now run the model as GGUF.
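For anyone hitting the same mismatch between the weights and the accompanying files, a rough sanity check of the model folder is sketched below. The list of expected file names is an assumption based on a typical Hugging Face Llama 3 checkpoint, and the function name is made up. It only confirms that each file exists and parses as JSON, not that its contents match the weights.

```python
import json
from pathlib import Path

# Assumed non-weight files of a typical Hugging Face Llama 3 checkpoint.
EXPECTED_FILES = (
    "config.json",
    "generation_config.json",
    "model.safetensors.index.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
)


def check_model_files(model_dir, expected=EXPECTED_FILES):
    """Return {file name: "ok" | "missing" | "invalid JSON"} for model_dir."""
    report = {}
    for name in expected:
        path = Path(model_dir) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        try:
            json.loads(path.read_text(encoding="utf-8"))
            report[name] = "ok"
        except ValueError:  # covers JSON and text decoding errors
            report[name] = "invalid JSON"
    return report


if __name__ == "__main__":
    for name, status in check_model_files("models/Meta-Llama-3-8B").items():
        print(f"{status:>12}  {name}")
```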

stefanbeeman-em commented 2 months ago

I'm having a similar problem on Apple M2. Even when I direct the install script to use the CPU, any attempt to load a model fails because the CUDA_HOME environment variable is not set.
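On Apple Silicon there is no CUDA at all, so it can help to confirm what PyTorch itself reports before digging into the web UI. The sketch below is not the web UI's loader logic and does not explain the CUDA_HOME error itself; pick_device is a hypothetical helper that only shows which backend the installed torch can use.

```python
import importlib.util


def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # assumption: fall back to CPU when torch is not installed

    import torch

    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists in newer PyTorch releases, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(f"PyTorch would use: {pick_device()}")
```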

oobabooga / text-generation-webui