Closed: ayttop closed this issue 2 months ago
How to use the server with curl?
curl -X 'POST' \
'http://localhost:8000/v1/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"prompt": "Tell me a story",
"temperature": 1,
"max_new_tokens": 128,
"top_k": 50,
"top_p": 1,
"stop_words": [
"string"
]
}'
(2) C:\Users\ArabTech\Desktop\2>curl -X POST "http://localhost:8000/v1/completions" ^
More? -H "accept: application/json" ^
More? -H "Content-Type: application/json" ^
More? -d "{\"prompt\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [\"string\"]}"
{"detail":"'text'"}

(2) C:\Users\ArabTech\Desktop\2>
(2) C:\Users\ArabTech\Desktop\2>curl -X POST "http://localhost:8000/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [\"string\"]}"
{"detail":"'text'"}
It does not run.
Windows 11, Anaconda, Python 3.11.
(2) C:\Users\ArabTech\Desktop\2>curl -X POST "http://localhost:8000/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"text\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [\"string\"]}"
{"detail":"'text'"}

(2) C:\Users\ArabTech\Desktop\2>curl -X POST "http://localhost:8000/v1/completions" -H "Content-Type: application/json" -d "{\"text\": \"Tell me a story\"}"
{"detail":"'text'"}

(2) C:\Users\ArabTech\Desktop\2>
(2) C:\Users\ArabTech\Desktop\2\nexa-sdk>nexa server --host 127.0.0.1 --port 8000 Phi-2:q4_0
INFO: Started server process [5432]
INFO: Waiting for application startup.
2024-09-09 17:43:43,550 - INFO - Model Path: Phi-2:q4_0
Model Phi-2:q4_0 already exists at C:\Users\ArabTech\.cache\nexa\hub\official\Phi-2\q4_0.gguf
2024-09-09 17:43:44,213 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x0000015396476210>
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:57924 - "GET /docs HTTP/1.1" 200 OK
INFO: 127.0.0.1:57924 - "GET /openapi.json HTTP/1.1" 200 OK
2024-09-09 17:44:28,514 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:57925 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 17:45:18,081 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:57948 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 17:45:35,981 - ERROR - Error in chat completions: 'text'
INFO: 127.0.0.1:57954 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
{"detail":"'text'"}
{ "detail": "'text'" }
postman
(2) C:\Users\ArabTech\Desktop\2>nexa server --host 127.0.0.1 --port 8000 Phi-2:q4_0
INFO: Started server process [25424]
INFO: Waiting for application startup.
2024-09-09 18:13:58,307 - INFO - Model Path: Phi-2:q4_0
Model Phi-2:q4_0 already exists at C:\Users\ArabTech\.cache\nexa\hub\official\Phi-2\q4_0.gguf
2024-09-09 18:13:58,945 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x0000022FB3901A90>
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:59196 - "GET /docs HTTP/1.1" 200 OK
INFO: 127.0.0.1:59196 - "GET /openapi.json HTTP/1.1" 200 OK
2024-09-09 18:14:24,686 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:59197 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 18:20:09,358 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:59270 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
(2) C:\Users\ArabTech\Desktop\2>nexa run gemma-2b:q2_K
Downloading gemma-2b/q2_K.gguf...
q2_K.gguf: 0%|▏ | 3.53M/1.32G [00:01<10:29, 2.24MiB/s]
An error occurred while downloading or processing the model: [SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2580)
Failed to pull model gemma-2b:q2_K
Unknown task: UNKNOWN. Skipping inference.
(2) C:\Users\ArabTech\Desktop\2>
(2) C:\Users\ArabTech\Desktop\2>nexa run gemma-2b:q4_K_M
Model gemma-2b:q4_K_M already exists at C:\Users\ArabTech\.cache\nexa\hub\official\gemma-2b\q4_K_M.gguf
k
2024-09-09 18:48:02,962 - ERROR - Error during generation: 'text'
Traceback (most recent call last):
  File "C:\Users\ArabTech\Desktop\2\nexa-sdk\nexa\gguf\nexa_inference_text.py", line 173, in run
    delta = chunk["choices"][0]["text"]
KeyError: 'text'
Send
(2) C:\Users\ArabTech\Desktop\2>nexa server --host 127.0.0.1 --port 8000 Meta-Llama-3-8B-Instruct:q2_K
INFO: Started server process [4512]
INFO: Waiting for application startup.
2024-09-09 19:11:50,509 - INFO - Model Path: Meta-Llama-3-8B-Instruct:q2_K
Model Meta-Llama-3-8B-Instruct:q2_K already exists at C:\Users\ArabTech\.cache\nexa\hub\official\Meta-Llama-3-8B-Instruct\q2_K.gguf
2024-09-09 19:11:50,931 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x000002158396A450>
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:51991 - "GET /docs HTTP/1.1" 200 OK
INFO: 127.0.0.1:51991 - "GET /openapi.json HTTP/1.1" 200 OK
2024-09-09 19:12:25,168 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:51992 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:12:39,967 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52001 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:13:04,016 - ERROR - Error in chat completions: 'text'
INFO: 127.0.0.1:52008 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
{ "detail": "'text'" }
(2) C:\Users\ArabTech\Desktop\2>nexa server Meta-Llama-3-8B-Instruct:q2_K
INFO: Started server process [12260]
INFO: Waiting for application startup.
2024-09-09 19:18:20,635 - INFO - Model Path: Meta-Llama-3-8B-Instruct:q2_K
Model Meta-Llama-3-8B-Instruct:q2_K already exists at C:\Users\ArabTech\.cache\nexa\hub\official\Meta-Llama-3-8B-Instruct\q2_K.gguf
2024-09-09 19:18:21,057 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x000001C6EBD23950>
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:52197 - "GET /docs HTTP/1.1" 200 OK
INFO: 127.0.0.1:52197 - "GET /openapi.json HTTP/1.1" 200 OK
2024-09-09 19:19:04,015 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52198 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:19:50,832 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52208 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:19:59,782 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52210 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:20:35,015 - ERROR - Error in chat completions: 'text'
INFO: 127.0.0.1:52211 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:21:15,760 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52297 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:21:33,361 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52300 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
2024-09-09 19:22:24,176 - ERROR - Error in text generation: 'text'
INFO: 127.0.0.1:52304 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
I tried other models, the docs page at http://127.0.0.1:8000/docs#/default/chat_completions_v1_chat_completions_post, and Postman, and all attempts failed.
curl -X 'POST' \
'http://127.0.0.1:8000/v1/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"prompt": "Tell me a story",
"temperature": 1,
"max_new_tokens": 128,
"top_k": 50,
"top_p": 1,
"stop_words": [
"string"
]
}'
http://127.0.0.1:8000/v1/completions
It runs, it works.
(2) C:\Users\ArabTech\Desktop\2\nexa-sdk>nexa server gemma
INFO: Started server process [16584]
INFO: Waiting for application startup.
2024-09-10 10:42:17,166 - INFO - Model Path: gemma
Downloading gemma-1.1-2b-instruct/q4_0.gguf...
q4_0.gguf: 100%|█████████████████████████████████████████████████████████████████| 1.44G/1.44G [07:24<00:00, 3.49MiB/s]
Successfully downloaded gemma-1.1-2b-instruct/q4_0.gguf to C:\Users\ArabTech\.cache\nexa\hub\official\gemma-1.1-2b-instruct\q4_0.gguf
Successfully pulled model gemma-1.1-2b-instruct:q4_0 to C:\Users\ArabTech\.cache\nexa\hub\official\gemma-1.1-2b-instruct\q4_0.gguf, run_type: NLP
2024-09-10 10:49:46,544 - INFO - model loaded as <nexa.gguf.llama.llama.Llama object at 0x000001B772B23210>
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:53721 - "POST /v1/completions HTTP/1.1" 200 OK
INFO: 127.0.0.1:53806 - "POST /v1/completions HTTP/1.1" 200 OK
postman
POST http://0.0.0.0:8000/v1/completions
Body, raw JSON:

{
  "prompt": "Tell me a story",
  "temperature": 1,
  "max_new_tokens": 128,
  "top_k": 50,
  "top_p": 1,
  "stop_words": ["string"]
}

Send

Pretty:

{ "result": "In a hidden valley nestled between towering mountains, there lived a brave princess named Anya. Her kingdom was under a cruel spell, and the once-vibrant land was now shrouded in an eerie gloom.\n\nOne fateful night, Anya heard a haunting melody carried on the wind. Curious, she followed the sound to a secluded grove. There, she found an ancient oak tree that whispered secrets in a language unknown to human ears.\n\nAs Anya listened intently, she discovered a tale of a forgotten curse and a magical artifact that could break the spell. Guided by the tree's wisdom, Anya embarked on a perilous quest through treacherous forests and sparkling" }
thank you
Brian, thank you.
{ "prompt": "Tell me a story", "temperature": 1, "max_new_tokens": 128, "top_k": 50, "top_p": 1, "stop_words": [ "string" ] }
http://0.0.0.0:8000/v1/completions
{ "prompt": "Tell me a story", "temperature": 1, "max_new_tokens": 128, "top_k": 50, "top_p": 1, "stop_words": [ "string" ] }
{ "result": "The wind whispered secrets through the ancient oak tree, rustling its leaves and sending shivers down its wrinkled bark. Beneath its watchful gaze, a young girl sat nestled in a woven mat, her eyes closed, her mind lost in a world of her own creation.\n\nHer story began with a tear, a silent spill on the fabric of her world. The world she built was a canvas of vibrant colors, filled with talking animals, sparkling rivers, and towering mountains. With each breath, her story grew, expanding beyond the boundaries of the mat.\n\nThe wind became her pen, guiding her words across the page. She spun tales of courage" }
Thank you, but how do I run this curl command in cmd on Windows 11?

curl -X 'POST' \
  'http://localhost:8000/v1/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt": "Tell me a story",
  "temperature": 1,
  "max_new_tokens": 128,
  "top_k": 50,
  "top_p": 1,
  "stop_words": [
    "string"
  ]
}'

On Postman it works:
http://0.0.0.0:8000/v1/completions
{ "prompt": "Tell me a story", "temperature": 1, "max_new_tokens": 128, "top_k": 50, "top_p": 1, "stop_words": [ "string" ] }
{ "result": "The wind whispered secrets through the ancient oak tree, rustling its leaves and sending shivers down its wrinkled bark. Beneath its watchful gaze, a young girl sat nestled in a woven mat, her eyes closed, her mind lost in a world of her own creation.\n\nHer story began with a tear, a silent spill on the fabric of her world. The world she built was a canvas of vibrant colors, filled with talking animals, sparkling rivers, and towering mountains. With each breath, her story grew, expanding beyond the boundaries of the mat.\n\nThe wind became her pen, guiding her words across the page. She spun tales of courage" }
Hi @ayttop, here is how to execute the curl command on Windows:

Step 1. Open Command Prompt: press Win + R, type cmd, and press Enter. Or press Win + X and select Command Prompt or Windows Terminal.

Step 2. Run the curl command: copy and paste the following command into the Command Prompt. For Windows, ensure that the JSON data in the -d flag is properly formatted:
curl -X POST "http://localhost:8000/v1/completions" ^
-H "accept: application/json" ^
-H "Content-Type: application/json" ^
-d "{ \"prompt\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [ \"string\" ] }"
^ is used for multi-line commands in Windows. Or run it as a single line:

curl -X POST "http://localhost:8000/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"prompt\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [ \"string\" ] }"
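If the quote escaping in cmd is still troublesome, one workaround (a sketch, not from this thread; body.json is just an example filename) is to save the request body to a file and let curl read it with -d @. Save this as body.json:

{
  "prompt": "Tell me a story",
  "temperature": 1,
  "max_new_tokens": 128,
  "top_k": 50,
  "top_p": 1,
  "stop_words": ["string"]
}

then run:

curl -X POST "http://localhost:8000/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d @body.json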
How to use the server?
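For reference, putting together the pieces that worked earlier in this thread (a sketch; gemma and port 8000 are just the values used above, adjust the model name to yours):

nexa server --host 127.0.0.1 --port 8000 gemma

then, from a second Command Prompt:

curl -X POST "http://127.0.0.1:8000/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d "{ \"prompt\": \"Tell me a story\", \"temperature\": 1, \"max_new_tokens\": 128, \"top_k\": 50, \"top_p\": 1, \"stop_words\": [ \"string\" ] }"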