LlamaEdge / rag-api-server

A RAG API server written in Rust following OpenAI specs
https://llamaedge.com/docs/user-guide/server-side-rag/quick-start
Apache License 2.0

error: instantiation failed: unknown import, Code: 0x62 #6

Closed: katopz closed this issue 4 months ago

katopz commented 4 months ago

I ran

wasmedge --dir .:. --nn-preload default:GGML:AUTO:Llama-2-7b-chat-hf-Q5_K_M.gguf \
    --nn-preload embedding:GGML:AUTO:all-MiniLM-L6-v2-ggml-model-f16.gguf \
    rag-api-server.wasm \
    --model-name Llama-2-7b-chat-hf-Q5_K_M,all-MiniLM-L6-v2-ggml-model-f16 \
    --ctx-size 4096,384 \
    --prompt-template llama-2-chat \
    --rag-prompt "Use the following pieces of context to answer the user's question.\nIf you don't know the answer, just say that you don't know, don't try to make up an answer.\n----------------\n" \
    --log-prompts \
    --log-stat

and got

[2024-05-14 18:24:13.534] [error] instantiation failed: unknown import, Code: 0x62
[2024-05-14 18:24:13.534] [error]     When linking module: "rustls_client" , function name: "new_codec"
[2024-05-14 18:24:13.534] [error]     At AST node: import description
[2024-05-14 18:24:13.534] [error]     At AST node: import section
[2024-05-14 18:24:13.534] [error]     At AST node: module

Not sure what I'm missing here 🤔

apepkuss commented 4 months ago

Seems the rustls plugin is not installed. Please try the following command to reinstall WasmEdge with the ggml and rustls plugins:

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- -v 0.13.5 --plugins wasi_nn-ggml wasmedge_rustls
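
To verify that the reinstall picked up both plugins, here is a minimal check, assuming the installer's default layout under $HOME/.wasmedge (the env file and plugin directory are the installer's defaults, not something confirmed in this thread):

# load the environment set up by the installer
source $HOME/.wasmedge/env
# confirm the runtime version
wasmedge --version
# the wasi_nn-ggml and rustls plugin libraries should appear here
ls $HOME/.wasmedge/plugin/
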
ChloeWKY commented 4 months ago

I reinstalled WasmEdge with the ggml and rustls plugins, rebuilt rag-api-server.wasm, and ran

wasmedge rag-api-server.wasm -h

but got

[2024-05-14 23:07:43.871] [error] instantiation failed: module name conflict, Code: 0x60
[2024-05-14 23:07:43.871] [error]     At AST node: module
LlamaEdge-RAG API Server

Usage: rag-api-server.wasm [OPTIONS] --model-name <MODEL_NAME> --prompt-template <PROMPT_TEMPLATE>

Options:
  -m, --model-name <MODEL_NAME>
          Sets names for chat and embedding models. The names are separated by comma without space, for example, '--model-name Llama-2-7b,all-minilm'
  -a, --model-alias <MODEL_ALIAS>
          Model aliases for chat and embedding models [default: default,embedding]
  -c, --ctx-size <CTX_SIZE>
          Sets context sizes for chat and embedding models. The sizes are separated by comma without space, for example, '--ctx-size 4096,384'. The first value is for the chat model, and the second is for the embedding model [default: 4096,384]
  -p, --prompt-template <PROMPT_TEMPLATE>
          Prompt template [possible values: llama-2-chat, llama-3-chat, mistral-instruct, mistrallite, openchat, codellama-instruct, codellama-super-instruct, human-assistant, vicuna-1.0-chat, vicuna-1.1-chat, vicuna-llava, chatml, baichuan-2, wizard-coder, zephyr, stablelm-zephyr, intel-neural, deepseek-chat, deepseek-coder, solar-instruct, phi-2-chat, phi-2-instruct, phi-3-chat, phi-3-instruct, gemma-instruct, octopus]
  -r, --reverse-prompt <REVERSE_PROMPT>
          Halt generation at PROMPT, return control
  -b, --batch-size <BATCH_SIZE>
          Batch size for prompt processing [default: 512]
      --rag-prompt <RAG_PROMPT>
          Custom rag prompt
      --rag-policy <POLICY>
          Strategy for merging RAG context into chat messages [default: system-message] [possible values: system-message, last-user-message]
      --qdrant-url <QDRANT_URL>
          URL of Qdrant REST Service [default: http://localhost:6333]
      --qdrant-collection-name <QDRANT_COLLECTION_NAME>
          Name of Qdrant collection [default: default]
      --qdrant-limit <QDRANT_LIMIT>
          Max number of retrieved result (no less than 1) [default: 5]
      --qdrant-score-threshold <QDRANT_SCORE_THRESHOLD>
          Minimal score threshold for the search result [default: 0.4]
      --chunk-capacity <CHUNK_CAPACITY>
          Maximum number of tokens each chunk contains [default: 100]
      --log-prompts
          Print prompt strings to stdout
      --log-stat
          Print statistics to stdout
      --log-all
          Print all log information to stdout
      --socket-addr <SOCKET_ADDR>
          Socket address of LlamaEdge API Server instance [default: 0.0.0.0:8080]
      --web-ui <WEB_UI>
          Root path for the Web UI files [default: chatbot-ui]
  -h, --help
          Print help (see more with '--help')
  -V, --version
          Print version

I tried renaming the modules backend, error, utils, and ggml to match their corresponding file names, then rebuilt the program, but unfortunately that did not solve the problem.

apepkuss commented 4 months ago

@ChloeWKY The error message at the beginning can be ignored. The upcoming WasmEdge v0.14.0 will fix it.

katopz commented 4 months ago

> Seems the rustls plugin is not installed. Please try the following command to reinstall WasmEdge with the ggml and rustls plugins:
>
> curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- -v 0.13.5 --plugins wasi_nn-ggml wasmedge_rustls

Thanks, working now.
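
For an end-to-end check, a quick smoke test is to hit the OpenAI-compatible chat endpoint. This is a sketch, assuming the server is listening on the default 0.0.0.0:8080 (see --socket-addr above) and that the model name matches the chat model passed via --model-name:

# send a single chat completion request to the running server
curl -X POST http://localhost:8080/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "Llama-2-7b-chat-hf-Q5_K_M", "messages": [{"role": "user", "content": "What is WasmEdge?"}]}'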

katopz commented 4 months ago

> @ChloeWKY The error message at the beginning can be ignored. The upcoming WasmEdge v0.14.0 will fix it.

So is this related? https://github.com/LlamaEdge/Example-LlamaEdge-RAG/issues/1 Seems like the example is also outdated?

apepkuss commented 4 months ago

Yeah. The example repo was created at a very early stage of the rag-api-server project. Since then, a lot of changes have been introduced into rag-api-server, but the example repo failed to keep pace with the project. We'll catch up ASAP.

katopz commented 4 months ago

> Yeah. The example repo was created at a very early stage of the rag-api-server project. Since then, a lot of changes have been introduced into rag-api-server, but the example repo failed to keep pace with the project. We'll catch up ASAP.

Many thanks! Hope this gets sorted out soon. I will be talking about LlamaEdge RAG at an AWS Thailand event on the 11th next month (30+ devs). No pressure 😅 I just really want to make WasmEdge look good there. 🤗