liltom-eth / llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.

GPU CUDA not found And HFValidationError #82

Open HorrorBest opened 10 months ago

HorrorBest commented 10 months ago

Hey there, I am new to this, so please keep that in mind when writing your response.

So I read the README and followed it. I didn't want to download the model `Llama-2-7b-Chat-GPTQ` through the terminal, so I downloaded it manually, put it in the `./models` folder, and then ran `app.py`. I got the following errors:

```
GPU CUDA not found.
Traceback (most recent call last):
  File "...\llama2-webui\app.py", line 325, in <module>
    main()
  File "...\llama2-webui\app.py", line 56, in main
    llama2_wrapper = LLAMA2_WRAPPER(
  File "...\llama2-webui\llama2_wrapper\model.py", line 98, in __init__
    self.init_tokenizer()
  File "...\llama2-webui\llama2_wrapper\model.py", line 116, in init_tokenizer
    self.tokenizer = LLAMA2_WRAPPER.create_llama2_tokenizer(self.model_path)
  File "...\llama2-webui\llama2_wrapper\model.py", line 160, in create_llama2_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_path)
  File "...\llama2-webui\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 652, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "...\llama2-webui\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 496, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "...\llama2-webui\venv\lib\site-packages\transformers\utils\hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "...\llama2-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "...\llama2-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/Llama-2-7b-Chat-GPTQ'. Use repo_type argument if needed.
```
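From the bottom of the traceback, the failure happens because `AutoTokenizer.from_pretrained` was given a string that did not resolve to a local directory, so transformers fell back to treating it as a Hugging Face Hub repo id, and a path like `./models/Llama-2-7b-Chat-GPTQ` is not a valid repo id. A minimal sketch to check whether the path actually resolves from the directory `app.py` is launched from (the path value here is just the `MODEL_PATH` from my `.env`):

```python
import os

# Same value as MODEL_PATH in my .env file.
model_path = "./models/Llama-2-7b-Chat-GPTQ"

# A relative path is resolved against the current working directory,
# so this has to be run from the llama2-webui repo root.
print("cwd:", os.getcwd())
print("model dir exists:", os.path.isdir(model_path))

# The tokenizer loader expects files like tokenizer_config.json in here.
if os.path.isdir(model_path):
    print(sorted(os.listdir(model_path)))
```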

When I searched online, some people said to install the CUDA drivers, so I did, but that didn't fix the problem. I also tried putting in the absolute path of the model, but still no luck. Here is my `.env` file:

```
MODEL_PATH = "./models/Llama-2-7b-Chat-GPTQ"
BACKEND_TYPE = "gptq"
LOAD_IN_8BIT = True

MAX_MAX_NEW_TOKENS = 2048
DEFAULT_MAX_NEW_TOKENS = 1024
MAX_INPUT_TOKEN_LENGTH = 4000

DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."
```
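For reference, a small sketch (assuming the app reads this file with python-dotenv, which is the usual way `.env` files are loaded in Python projects) to print what `MODEL_PATH` actually resolves to at startup:

```python
import os

from dotenv import load_dotenv  # assumed dependency: python-dotenv

load_dotenv()  # reads .env from the current working directory

model_path = os.getenv("MODEL_PATH", "")
print("MODEL_PATH:", repr(model_path))
print("absolute path:", os.path.abspath(model_path))
print("directory exists:", os.path.isdir(model_path))
```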

I have an Intel HD Graphics 540 GPU (no clue how much VRAM it has) and an Nvidia M2000M with 4 GB of VRAM, plus 16 GB of RAM, running on Windows 11.
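Since the first line of the output is "GPU CUDA not found", a quick sanity check with the PyTorch installed in the venv can show whether the Nvidia card is visible at all; that message typically points to a CPU-only torch build or a driver problem:

```python
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
else:
    # A CPU-only wheel reports None here; a CUDA wheel reports e.g. "11.8".
    print("torch CUDA build:", torch.version.cuda)
```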