Related examples can be added to the llama-cli tutorial notebooks. The Quickstart notebook should show how to download a model from HF.
- `-mu MODEL_URL, --model-url MODEL_URL`: specify a remote HTTP URL from which to download the model file.
Here is an example of its usage:
MODEL_URL=https://huggingface.co/ggml-org/gemma-1.1-7b-it-Q4_K_M-GGUF/resolve/main/gemma-1.1-7b-it.Q4_K_M.gguf
llama-cli --model-url "$MODEL_URL" --prompt "Once upon a time"
This requires a build with curl support enabled. See issue #11.
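For reference, a minimal sketch of a curl-enabled CMake build that such a notebook could include. The option name `LLAMA_CURL` is an assumption here; verify it against the current llama.cpp build documentation.

```bash
# Sketch: build llama.cpp with libcurl support so llama-cli can download models over HTTP.
# Assumes libcurl development headers are installed and the CMake option is LLAMA_CURL.
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release -j
```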
Other relevant HF-related options for llama-cli are shown below, followed by an example that uses them.
-hfr, --hf-repo REPO Hugging Face model repository (default: unused)
(env: LLAMA_ARG_HF_REPO)
-hff, --hf-file FILE Hugging Face model file (default: unused)
(env: LLAMA_ARG_HF_FILE)
-hft, --hf-token TOKEN Hugging Face access token (default: value from HF_TOKEN environment
variable)
(env: HF_TOKEN)
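For example, the Gemma model above could be pulled by repo and file name instead of a full URL. This is a sketch, reusing the repo and file names from the `--model-url` example; a gated repo would additionally need `--hf-token` (or the `HF_TOKEN` environment variable).

```bash
# Sketch: download and run the same GGUF directly by HF repo/file (curl-enabled build required).
llama-cli \
  --hf-repo ggml-org/gemma-1.1-7b-it-Q4_K_M-GGUF \
  --hf-file gemma-1.1-7b-it.Q4_K_M.gguf \
  --prompt "Once upon a time"
```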
Closed by PR #28.
There is a nice example in the HF docs showing how you can use both `llama-cli` and `llama-server` to work with GGUF files directly from HF: https://huggingface.co/docs/hub/en/gguf-llamacpp
These are good examples to include in our tutorials.
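A sketch of what such a tutorial cell might look like for `llama-server`, reusing the same repo and file. The default port 8080 and the OpenAI-compatible `/v1/chat/completions` endpoint are assumptions; check the linked HF page for the exact invocation.

```bash
# Sketch: serve a GGUF pulled straight from HF, then query it.
llama-server \
  --hf-repo ggml-org/gemma-1.1-7b-it-Q4_K_M-GGUF \
  --hf-file gemma-1.1-7b-it.Q4_K_M.gguf &

# Once the server is up, send an OpenAI-style chat completion request.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Once upon a time"}]}'
```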