suryyyansh closed this 4 months ago
You can also add:
The command above starts the API server on the default socket address. Several other options are also specified in the command:
- The `--dir .:.` option specifies the current directory as the root directory of the WASI file system.
- The `--nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf` option specifies the Llama model to be used by the API server. The pattern of the argument is `<name>:<encoding>:<target>:<model path>`. Here, the model file is `llama-2-7b-chat.Q5_K_M.gguf`, and we give it the alias `default` as its name in the runtime environment. Change the model file here if you're not using Llama-2-7B-Chat.
- The `--prompt-template llama-2-chat` option specifies the prompt template for the model.
- The `--model-name llama-2-7b-chat` option specifies the model name, which is used in the chat request.
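For reference, the options above combine into a single invocation along these lines (a sketch: the server wasm file name `llama-api-server.wasm` is an assumption here, so adjust it and the model path to your setup):

```shell
# Sketch of the full serve command the options above describe. Requires
# wasmedge with the WASI-NN GGML plugin, the model file, and the API
# server wasm (name assumed) in the current directory.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template llama-2-chat \
  --model-name llama-2-7b-chat
```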
as an explanation of the command, to make things a bit clearer.
Also this:
Please ensure that the port is not occupied by another process. If the specified port is available on your machine and the command succeeds, you should see the following output in the terminal:
```console
[INFO] LlamaEdge HTTP listening on 8080
```
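Once the server is listening, the `--model-name` value is what a client puts in the `model` field of the chat request. A rough sketch, assuming the server's OpenAI-compatible `/v1/chat/completions` endpoint on port 8080:

```shell
# The chat request body; "model" must match the --model-name passed above.
BODY='{"model": "llama-2-7b-chat", "messages": [{"role": "user", "content": "What is WasmEdge?"}]}'

# Send it to the running server (uncomment once the server is up):
# curl -X POST http://localhost:8080/v1/chat/completions \
#   -H 'Content-Type: application/json' \
#   -d "$BODY"
```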
Noted! Will add this to another commit in a bit.
While an example project is available at (Example-LlamaEdge-RAG), the README for this specific project doesn't highlight the crucial step of having `qdrant` running, and neither does `wasmedge` at execution time when it doesn't find an active `qdrant` instance.

This PR aims to make this information more obvious, and adds a final `example usage` section with a barebones example.

Relevant issue