NexaAI / nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.
https://docs.nexa.ai/
Apache License 2.0
2.88k stars 418 forks source link

about nexa server #61

Closed zhb-code closed 1 month ago

zhb-code commented 2 months ago

I want to experience the nexa server function. Model Qwen2-7B-Instruct:q5_K_M has been downloaded to the local PC using nexa run Qwen2-7B-Instruct:q5_K_M. However, when executing nexa server --host 0.0.0.0 --port 8000 Qwen2-7B-Instruct:q5_K_M, the following error message is displayed: Model Qwen2-7B-Instruct:q5_K_M not found in NEXA_RUN_MODEL_MAP, I want to know what to do?

zhiyuan8 commented 2 months ago

Thanks for bringing this up, we will fix it in next release, in 1-2 days.

zhiyuan8 commented 2 months ago

@zhb-code Our v0.0.8.1 is out: https://nexaai.github.io/nexa-sdk/whl/

pip uninstall nexaai

Then please follow the doc https://docs.nexaai.com/getting-started/installation to install

zhiyuan8 commented 2 months ago

image

zhb-code commented 2 months ago

First of all, thank you for your reply, I successfully started the service according to the method you provided, and then chatted according to your api, but the reply I got was seriously hallucinational and the answer was not Instruct:q5_K_M, I directly ran nexa run Qwen2-7B-Instruct: q5_k_m, Then ask, "hello" "Hello! How can I help you today? If you have any questions or need information on a particular topic, feel free to ask.", which is a more normal answer, but the way the service and api are accessed is:,.\n\nI've been trying to prove this for a while but I can't seem to get it.\n\nProve that $\tan\frac{\alpha}{2}=\frac{\sin\alpha}{1+\cos\alpha}$\n\nI've tried to do it by drawing a triangle and using the definitions of the three functions. However, I can't seem to get it. I've also tried doing it by using the definitions of the functions with the half angle of alpha, but that didn't work.\n\nI'd appreciate any help on this. I'd prefer hints rather than a complete answer, but that's up to you. I don't even know what it's saying. What makes the answers so different?

zhb-code commented 2 months ago

111 222 All models are Qwen2-7B-Instruct/q5_K_M.gguf

JoyboyBrian commented 2 months ago

Hi @zhb-code, thank you for this insight. We're addressing issues across the Qwen series. For now, we recommend using gemma-1.1-2b-instruct:q4_0. We appreciate your feedback and will update on progress.

zhb-code commented 1 month ago

@JoyboyBrian Thank you for your reply and look forward to your update to solve the problem.

ayttop commented 1 month ago

(2) C:\Users\ArabTech\Desktop\2>curl -X POST http://127.0.0.1:8000/v1/completions -H "Content-Type: application/json" -d "{\"prompt\":\"Tell me a story\"}" {"detail":"'text'"}

ayttop commented 1 month ago

How do I turn on the server? I tried many commands and they do not work.

ayttop commented 1 month ago

(2) C:\Users\ArabTech\Desktop\2\nexa-sdk>nexa server Octopus-v2:q2_K INFO: Started server process [24352] INFO: Waiting for application startup. 2024-09-09 17:10:12,089 - INFO - Model Path: Octopus-v2:q2_K Model Octopus-v2:q2_K already exists at C:\Users\ArabTech.cache\nexa\hub\official\Octopus-v2\q2_K.gguf 2024-09-09 17:10:12,371 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x00000259F4762390> INFO: Application startup complete. INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit) 2024-09-09 17:14:07,788 - ERROR - Error in text generation: 'text' INFO: 127.0.0.1:57049 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error 2024-09-09 17:14:32,989 - ERROR - Error in text generation: 'text' INFO: 127.0.0.1:57058 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error

(2) C:\Users\ArabTech\Desktop\2\nexa-sdk>curl -X POST http://localhost:8000/v1/completions -H "Content-Type: application/json" -d "{\"prompt\": \"Hello, world!\"}" {"detail":"'text'"} (2) C:\Users\ArabTech\Desktop\2\nexa-sdk>curl -X POST http://localhost:8000/v1/completions -H "Content-Type: application/json" -d "{\"text\": \"Hello, world!\"}" {"detail":"'text'"} (2) C:\Users\ArabTech\Desktop\2\nexa-sdk>

ayttop commented 1 month ago

how to use server?