LlamaEdge / rag-api-server

A RAG API server written in Rust following OpenAI specs
https://llamaedge.com/docs/user-guide/server-side-rag/quick-start
Apache License 2.0

Make README instructions more usable #5

Closed suryyyansh closed 4 months ago

suryyyansh commented 4 months ago

While an example project is available (Example-LlamaEdge-RAG), the README for this project doesn't highlight the crucial step of having a Qdrant instance running, and wasmedge doesn't flag the problem at execution time when it can't find an active Qdrant instance.
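For reference, a minimal way to get a local Qdrant instance running is via Docker; this is just a sketch that assumes Docker is installed and the default Qdrant ports 6333/6334 are free, and the exact command the README ends up recommending may differ:

```bash
# Start a local Qdrant instance on the default REST (6333) and gRPC (6334) ports
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
```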

This PR aims to make this information more obvious, and adds a final example usage section with a barebones example.
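For illustration, a barebones usage example along the lines of what the new section describes might look like the following, assuming the server is listening on the default port 8080 and the chat model is registered as `llama-2-7b-chat` (the actual section in the README may use different values):

```bash
# Send an OpenAI-style chat completion request to the running RAG API server
curl -X POST http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "llama-2-7b-chat",
        "messages": [
          {"role": "user", "content": "What is LlamaEdge?"}
        ]
      }'
```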

Relevant issue

harsh-ps-2003 commented 4 months ago

You can also add:

The command above starts the API server on the default socket address. In addition, the command specifies several other options:

- The `--dir .:.` option specifies the current directory as the root directory of the WASI file system.

- The `--nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf` option specifies the Llama model to be used by the API server. The pattern of the argument is `<name>:<encoding>:<target>:<model path>`. Here, the model used is `llama-2-7b-chat.Q5_K_M.gguf`, and we give it the alias `default` as its name in the runtime environment. You can change the model name here if you're not using llama-2-7b-chat.
- The `--prompt-template llama-2-chat` option specifies the prompt template for the model.
- The `--model-name llama-2-7b-chat` specifies the model name. It is used in the chat request.

as an explanation of the command, to make things a bit clearer.
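For context, the flags described above correspond to an invocation roughly like the following; the file name `rag-api-server.wasm` is an assumption here, and the real command in the README may include additional flags (e.g. for the embedding model):

```bash
# Sketch of a wasmedge invocation using the options explained above
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  rag-api-server.wasm \
  --prompt-template llama-2-chat \
  --model-name llama-2-7b-chat
```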

harsh-ps-2003 commented 4 months ago

Also this:

Please ensure that the port is not occupied by another process. If the specified port is available on your machine and the command succeeds, you should see the following output in the terminal:

```console
[INFO] LlamaEdge HTTP listening on 8080
```
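As an aside, one quick way to check whether port 8080 is already taken on Linux or macOS is a generic `lsof` lookup (this is just a local check, not something the server provides):

```bash
# List any process currently listening on port 8080
lsof -i :8080
```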
suryyyansh commented 4 months ago

Noted! Will add this to another commit in a bit.