LlamaEdge / rag-api-server

A RAG API server written in Rust following OpenAI specs
https://llamaedge.com/docs/user-guide/server-side-rag/quick-start
Apache License 2.0

My local RAG returns garbled text #12

Open chengr4 opened 3 months ago

chengr4 commented 3 months ago

Hi,

I followed the steps in the README but failed at the last step: the response to the prompt is garbled text.

Screenshot 2024-05-24 at 11 49 37 AM

I hope someone can guide me to the right path. 🫠

But the server's search seems to be correct? 🤔

Screenshot 2024-05-24 at 11 54 05 AM


juntao commented 3 months ago

Can you paste the command you used to start the server?

chengr4 commented 3 months ago

I copy-pasted the one from the README:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-2-7b-chat-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b-chat-hf-Q5_K_M,all-MiniLM-L6-v2-ggml-model-f16 \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat \
    --rag-prompt "Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n" \
    --log-prompts \
    --log-stat
Screenshot 2024-05-24 at 3 01 08 PM

juntao commented 3 months ago

Can you add --model-alias default,embedding to the command and try again? Thanks!
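For clarity, here is what the README command looks like with that flag added (a sketch combining the command above with the suggested flag; model filenames are the ones from the README, and the flag's position in the argument list is an assumption):

```shell
# Sketch: README startup command plus the suggested --model-alias flag.
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-2-7b-chat-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b-chat-hf-Q5_K_M,all-MiniLM-L6-v2-ggml-model-f16 \
    --model-alias default,embedding \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat \
    --rag-prompt "Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n" \
    --log-prompts \
    --log-stat
```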

chengr4 commented 3 months ago

Looks no different 😢.

Video link: https://drive.google.com/file/d/1OXLZhQwcyabCpgQ8_YXnLrGRJNl-81-N/view?usp=sharing

chengr4 commented 3 months ago

Retried on 2024-06-15.

I got stuck running:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-2-7b-chat-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b-chat-hf-Q5_K_M,all-MiniLM-L6-v2-ggml-model-f16 \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat,embedding \
    --rag-prompt "Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n" \
    --log-prompts \
    --log-stat

I get the error "wasi-logging plugin not installed. Please install the plugin and restart WasmEdge.":

[2024-06-15 15:07:40.370] [error] wasi-logging plugin not installed. Please install the plugin and restart WasmEdge.
[2024-06-15 15:07:40.371] [error] execution failed: host function failed, Code: 0x40e
[2024-06-15 15:07:40.371] [error]     When executing function name: "_start"

But if I install wasi-logging, wasi_nn gets removed.

Versions: rag-api-server 0.6.6, WasmEdge 0.14.0

juntao commented 3 months ago

You can install both plugins. Just re-run the installer; it installs both automatically.

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install_v2.sh | bash -s -- -v 0.13.5
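After re-running the installer, one way to sanity-check that both plugins landed is to list the plugin directory (the `~/.wasmedge` paths below are the installer's defaults and an assumption here; adjust if you installed elsewhere):

```shell
# Load the environment the installer sets up (default install location assumed).
source "$HOME/.wasmedge/env"

# Both plugin libraries should be present, e.g. libwasmedgePluginWasiNN.*
# and libwasmedgePluginWasiLogging.* (exact filenames vary by platform).
ls "$HOME/.wasmedge/plugin/"
```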
chengr4 commented 3 months ago

Thanks for the guidance.

However, I still got garbled text as before. 😢


Versions: rag-api-server 0.6.6, WasmEdge 0.13.5

apepkuss commented 2 months ago

@chengr4 Please update --prompt-template llama-2-chat to --prompt-template llama-2-chat,embedding.

chengr4 commented 2 months ago

I ran:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-2-7b-chat-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b-chat-hf-Q5_K_M,all-MiniLM-L6-v2-ggml-model-f16 \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat,embedding \
    --rag-prompt "Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n" \
    --log-prompts \
    --log-stat

However, I still got garbled text.


Versions: rag-api-server 0.7.1, WasmEdge 0.13.5
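For anyone trying to reproduce this, a minimal OpenAI-style chat request can be sent with curl once the server is up. This is a sketch: port 8080 is assumed as the default here, and the model name must match the `--model-name` value used at startup:

```shell
# Hypothetical reproduction request against a locally running rag-api-server;
# adjust host/port if the server was started on a different address.
curl -s http://localhost:8080/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{
          "model": "Llama-2-7b-chat-hf-Q5_K_M",
          "messages": [
            {"role": "user", "content": "What does the uploaded document say?"}
          ]
        }'
```

If the garbling appears in this raw response too, that would rule out the client-side chat UI as the cause.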