Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.
Hey there, I am new to this, so please consider that while writing your response.
So I read the readme and followed it... I didn't want to download the model "Llama-2-7b-Chat-GPTQ" through the terminal, so I downloaded it manually and put it in the "./models" folder. Then I ran the "app.py" file and got the following errors:
GPU CUDA not found.
Traceback (most recent call last):
  File "...\llama2-webui\app.py", line 325, in <module>
    main()
  File "...\llama2-webui\app.py", line 56, in main
    llama2_wrapper = LLAMA2_WRAPPER(
  File "...\llama2-webui\llama2_wrapper\model.py", line 98, in __init__
    self.init_tokenizer()
  File "...\llama2-webui\llama2_wrapper\model.py", line 116, in init_tokenizer
    self.tokenizer = LLAMA2_WRAPPER.create_llama2_tokenizer(self.model_path)
  File "...\llama2-webui\llama2_wrapper\model.py", line 160, in create_llama2_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_path)
  File "...\llama2-webui\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 652, in from_pretrained
    tokenizer_config = get_tokenizer_config(pretrained_model_name_or_path, **kwargs)
  File "...\llama2-webui\venv\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 496, in get_tokenizer_config
    resolved_config_file = cached_file(
  File "...\llama2-webui\venv\lib\site-packages\transformers\utils\hub.py", line 417, in cached_file
    resolved_file = hf_hub_download(
  File "...\llama2-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 110, in _inner_fn
    validate_repo_id(arg_value)
  File "...\llama2-webui\venv\lib\site-packages\huggingface_hub\utils\_validators.py", line 158, in validate_repo_id
    raise HFValidationError(
huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/Llama-2-7b-Chat-GPTQ'. Use repo_type argument if needed.
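If I understand the error correctly, `AutoTokenizer.from_pretrained` only loads from disk when the given directory actually exists; otherwise it treats the string as a Hugging Face Hub repo id, and `'./models/Llama-2-7b-Chat-GPTQ'` fails repo-id validation. Here is a small check I put together to see which case applies (the path is the one from my traceback):

```python
import os

# Path the app passed to AutoTokenizer.from_pretrained (from the traceback above).
model_path = "./models/Llama-2-7b-Chat-GPTQ"

# If this prints False, transformers cannot find a local folder there and will
# fall back to validating the string as a Hub repo id, which raises
# HFValidationError because repo ids cannot contain './'.
print(os.path.isdir(model_path))
```

So maybe the folder name or the working directory I run "app.py" from doesn't match what the config expects?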
When I searched online, some people said to install the CUDA drivers, so I did, but it still didn't fix the problem. I also tried putting the absolute path of the model, but still no luck. Here is my ".env" file:
DEFAULT_SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information."
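For reference, I believe the project's sample `.env` also has entries like the ones below (these values are my guess based on the README, not lines from my actual file):

```shell
# Assumed .env entries from the llama2-webui README; adjust paths to your setup.
MODEL_PATH = "./models/Llama-2-7b-Chat-GPTQ"
BACKEND_TYPE = "gptq"
```

Do I need to set `MODEL_PATH` like this for the app to find my manually downloaded model?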
I have an Intel HD Graphics 540 GPU (no clue how much VRAM it has) and an Nvidia M2000M with 4 GB of VRAM, plus 16 GB of RAM, running on Windows 11.