PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.
Apache License 2.0

Error running run_localGPT.py with Mac M2 #501

Open manuel2f opened 11 months ago

manuel2f commented 11 months ago

I'm trying to run the default case, but it doesn't work.

```
(localGPT) ➜ localGPT git:(main) ✗ CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir
```
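If the Metal build completes, a quick import check from the same conda environment should confirm that the wheel actually landed where run_localGPT.py will look for it (a minimal sketch using only the standard library; importlib.metadata is available on Python 3.8+):

```python
# Run inside the (localGPT) environment. A ModuleNotFoundError here means the
# pip install above went into a different environment or the build failed.
import importlib.metadata

import llama_cpp  # the module the langchain LlamaCpp wrapper tries to import

print(importlib.metadata.version("llama-cpp-python"))  # expect 0.1.83 here
```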

```
(localGPT) ➜ localGPT git:(main) ✗ python run_localGPT.py --device_type mps
2023-09-20 09:54:57,306 - INFO - run_localGPT.py:221 - Running on: mps
2023-09-20 09:54:57,306 - INFO - run_localGPT.py:222 - Display Source Documents set to: False
2023-09-20 09:54:57,306 - INFO - run_localGPT.py:223 - Use history set to: False
2023-09-20 09:54:57,575 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length 512
2023-09-20 09:54:59,820 - INFO - posthog.py:16 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-09-20 09:54:59,867 - INFO - run_localGPT.py:56 - Loading Model: TheBloke/Llama-2-7b-Chat-GGUF, on: mps
2023-09-20 09:54:59,867 - INFO - run_localGPT.py:57 - This action can take a few minutes!
2023-09-20 09:54:59,867 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models
Traceback (most recent call last):
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 258, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 229, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type="llama")
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 144, in retrieval_qa_pipline
    qa = RetrievalQA.from_chain_type(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/chains/retrieval_qa/base.py", line 100, in from_chain_type
    combine_documents_chain = load_qa_chain(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/chains/question_answering/__init__.py", line 249, in load_qa_chain
    return loader_mapping[chain_type](
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/chains/question_answering/__init__.py", line 73, in _load_stuff_chain
    llm_chain = LLMChain(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 341, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for LLMChain
llm
  none is not an allowed value (type=type_error.none.not_allowed)
```
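For context, `llm none is not an allowed value` usually means the model loader returned None instead of raising, so LLMChain receives llm=None and pydantic rejects it. A hypothetical reconstruction of that failure path (not the project's exact code; the follow-up traceback below supports this reading):

```python
# Sketch: if llama-cpp-python is missing or broken, a loader written like this
# hands None back to the caller, and the error only surfaces later inside
# LLMChain's pydantic validation.
def load_gguf_model_sketch(**kwargs):
    try:
        from llama_cpp import Llama  # fails when the native build is unusable
        return Llama(**kwargs)
    except Exception:
        return None  # this None becomes LLMChain(llm=None) -> ValidationError
```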

manuel2f commented 11 months ago

Also, I tried with another model:

```
(localGPT) ➜ localGPT git:(main) ✗ python run_localGPT.py --device_type mps
2023-09-20 10:13:30,116 - INFO - run_localGPT.py:221 - Running on: mps
2023-09-20 10:13:30,116 - INFO - run_localGPT.py:222 - Display Source Documents set to: False
2023-09-20 10:13:30,116 - INFO - run_localGPT.py:223 - Use history set to: False
2023-09-20 10:13:30,334 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length 512
2023-09-20 10:13:32,798 - INFO - posthog.py:16 - Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
2023-09-20 10:13:32,845 - INFO - run_localGPT.py:56 - Loading Model: TheBloke/Llama-2-7B-Chat-GGML, on: mps
2023-09-20 10:13:32,845 - INFO - run_localGPT.py:57 - This action can take a few minutes!
2023-09-20 10:13:32,845 - INFO - load_models.py:38 - Using Llamacpp for GGUF/GGML quantized models
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/llms/llamacpp.py", line 149, in validate_environment
    from llama_cpp import Llama
ModuleNotFoundError: No module named 'llama_cpp'
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/Users/admin/Downloads/llama2/localGPT/load_models.py", line 56, in load_quantized_model_gguf_ggml
    return LlamaCpp(**kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/load/serializable.py", line 74, in __init__
    super().__init__(**kwargs)
  File "pydantic/main.py", line 339, in pydantic.main.BaseModel.__init__
  File "pydantic/main.py", line 1102, in pydantic.main.validate_model
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/langchain/llms/llamacpp.py", line 153, in validate_environment
    raise ImportError(
ImportError: Could not import llama-cpp-python library. Please install the llama-cpp-python library to use this embedding model: pip install llama-cpp-python
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 258, in <module>
    main()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/localGPT/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 229, in main
    qa = retrieval_qa_pipline(device_type, use_history, promptTemplate_type="llama")
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 132, in retrieval_qa_pipline
    llm = load_model(device_type, model_id=MODEL_ID, model_basename=MODEL_BASENAME, LOGGING=logging)
  File "/Users/admin/Downloads/llama2/localGPT/run_localGPT.py", line 64, in load_model
    model, tokenizer = load_quantized_model_gguf_ggml(model_id, model_basename, device_type, LOGGING)
  File "/Users/admin/Downloads/llama2/localGPT/load_models.py", line 59, in load_quantized_model_gguf_ggml
    logging.INFO("If you were using GGML model, LLAMA-CPP Dropped Support, Use GGUF Instead")
TypeError: 'int' object is not callable
```
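The final TypeError is a separate small bug visible in the traceback itself: logging.INFO is the integer level constant, not a function, so calling it raises `'int' object is not callable`. The intended call is presumably the lowercase logging.info:

```python
import logging

# logging.INFO == 20 (an int), so logging.INFO("...") cannot be called.
# The lowercase module-level function logs at INFO level as intended:
logging.info("If you were using GGML model, LLAMA-CPP Dropped Support, Use GGUF Instead")
```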

jamoram commented 11 months ago

It looks like the problem is with llama_cpp. Try reinstalling it following the explanation in https://www.youtube.com/watch?v=G_prHSKX9d4&t=243s&ab_channel=PromptEngineering

kime541200 commented 11 months ago

I have the same problem; is there any solution? I have followed the installation steps to set up the environment. PS: my OS is Windows and my GPU is an RTX 3060.

FeRm00 commented 11 months ago

I had the same problem, and @jamoram's answer pointed to the solution:

```
set CMAKE_ARGS=-DLLAMA_CUBLAS=on
set FORCE_CMAKE=1
```

And now, if you are using a GGUF language model:

```
pip install llama-cpp-python==0.1.83
```

If you are using a GGML model:

```
pip install llama-cpp-python==0.1.76
```

The model you are using is set in constants.py. The default is:

```python
MODEL_ID = "TheBloke/Llama-2-7b-Chat-GGUF"
MODEL_BASENAME = "llama-2-7b-chat.Q4_K_M.gguf"
```

lwh8915 commented 9 months ago

@FeRm00, I have set it up according to this, but I still get the same error.

Ath3neNoctua commented 8 months ago

Same issue with the same setup.