neuralmagic / guidellm

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
Apache License 2.0

how to call the remote api #62

Open jackqdldd opened 1 month ago

jackqdldd commented 1 month ago

The remote server directory: 7a3f1c5610cc98b06ecae5ebdad460c8

The request URL looks like: http://10.10.10.10:40105/v1/chat/completions

Then I ran this command:

    guidellm \
      --target "http://10.10.10.10:40035/v1" \
      --model "MiniCPM3-4B" \
      --data-type emulated \
      --data "prompt_tokens=512,generated_tokens=128" \
      --rate-type sweep --rate 2 --max-requests 2

I got the error: Max retries exceeded with url: /MiniCPM3-4B/resolve/main/config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f1e0daaad70>, 'Connection to huggingface.co timed out. (connect timeout=10)'
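
The path in this error follows the Hugging Face Hub file-download pattern (`/<repo_id>/resolve/<revision>/<filename>`), so the failing request appears to be a lookup of `MiniCPM3-4B` on huggingface.co rather than a call to the target server. Below is a minimal sketch that reproduces the path, assuming that pattern. `hub_file_url` is a hypothetical helper written for illustration and is not part of guidellm.

```python
from urllib.parse import urlsplit

# Default public Hub endpoint (the host named in the error message).
HUB_ENDPOINT = "https://huggingface.co"


def hub_file_url(repo_id: str, filename: str = "config.json", revision: str = "main") -> str:
    """Hypothetical helper: build a Hub file URL of the form
    <endpoint>/<repo_id>/resolve/<revision>/<filename>."""
    return f"{HUB_ENDPOINT}/{repo_id}/resolve/{revision}/{filename}"


# The value passed to --model is used here as the repo id.
url = hub_file_url("MiniCPM3-4B")
parts = urlsplit(url)

print(parts.netloc)  # huggingface.co
print(parts.path)    # /MiniCPM3-4B/resolve/main/config.json
```

The printed path matches the one in the error, which suggests the timeout comes from a machine that cannot reach huggingface.co, independent of whether the server at `--target` is reachable.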