Closed: rchan26 closed 1 year ago
Implement llama-cpp model in query engine.
For example, after setting up your Slack tokens (see README), you should be able to load up an instance (using the handbook data):
```
python slack_bot/run.py --model llama-index-llama-cpp --data data --which-index handbook --n-gpu-layers 1 --model-name gguf_models/llama-2-13b-chat.Q6_K.gguf --path
```
given that you're in the same directory as `llama-2-13b-chat.Q6_K.gguf`. Note the `--path` flag, which signifies that the model name is a path. Equivalently, you could have:
```
python slack_bot/run.py -m llama-index-llama-cpp -d data -w handbook -ngl 1 -n https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF/resolve/main/llama-2-13b-chat.Q6_K.gguf
```
This just downloads it straight from the URL instead.
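The path-vs-URL behaviour of `--path` could be sketched roughly as below. This is a hypothetical illustration of the flag's logic, not the actual implementation in `slack_bot/run.py`; the function name `resolve_model_source` is invented for this sketch:

```python
from urllib.parse import urlparse


def resolve_model_source(model_name: str, is_path: bool) -> str:
    """Hypothetical sketch: decide whether model_name refers to a
    local GGUF file or a URL to download the model from."""
    if is_path:
        # --path was passed: treat the model name as a local file path
        return f"local:{model_name}"
    # otherwise assume it's a URL and the model should be downloaded first
    parsed = urlparse(model_name)
    if parsed.scheme in ("http", "https"):
        return f"download:{model_name}"
    raise ValueError(
        "model name is neither a local path (use --path) nor an http(s) URL"
    )
```

Under this sketch, the first command above resolves to the local file, while the second triggers a download from the Hugging Face URL.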