ray-project / llmperf

LLMPerf is a library for validating and benchmarking LLMs
Apache License 2.0

Bug: Hugging Face TGI not working #33

Open ptrmayer opened 4 months ago

ptrmayer commented 4 months ago

When trying to load test an LLM deployed with Hugging Face TGI v1.4, using the following commands:

export OPENAI_API_BASE=""
export OPENAI_API_KEY="test"
python3.9 token_benchmark_ray.py \
--model "mistralai/Mistral-7B-Instruct-v0.2" \
--mean-input-tokens 550 \
--stddev-input-tokens 150 \
--mean-output-tokens 150 \
--stddev-output-tokens 10 \
--max-num-completed-requests 100 \
--timeout 600 \
--num-concurrent-requests 5 \
--results-dir "result_outputs" \
--llm-api openai \
--additional-sampling-params '{}'

the following error occurs:

(OpenAIChatCompletionsClient pid=82698) Warning Or Error: 422 Client Error: Unprocessable Entity for url: 
(OpenAIChatCompletionsClient pid=82698) 422

I was able to fix this error by changing line 79 of openai_chat_completions_client.py to stem = "data:".
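
For illustration, here is a minimal sketch of a more tolerant way to strip that prefix, assuming the problem is that the server sends streamed lines starting with "data:" and no space after the colon, while the client expects a fixed prefix that includes one. The Server-Sent Events format makes the single space after the colon optional, so accepting both forms should be safe. `parse_sse_data_line` is a hypothetical helper name and is not part of llmperf; this is not the actual client code.

```python
import json
from typing import Optional


def parse_sse_data_line(line: bytes) -> Optional[dict]:
    """Decode the JSON payload of one streamed `data` line.

    Hypothetical helper (not part of llmperf). Returns None for lines
    that carry no JSON payload: blank lines, other SSE fields, and the
    "[DONE]" end-of-stream marker.
    """
    line = line.strip()
    if not line.startswith(b"data:"):
        return None
    # Drop the field name, then any whitespace that follows the colon,
    # so both "data: {...}" and "data:{...}" are accepted.
    payload = line[len(b"data:"):].strip()
    if not payload or payload == b"[DONE]":
        return None
    return json.loads(payload)


if __name__ == "__main__":
    # The same payload with and without a space after the colon.
    print(parse_sse_data_line(b'data: {"token": "hi"}'))
    print(parse_sse_data_line(b'data:{"token": "hi"}'))
    print(parse_sse_data_line(b"data: [DONE]"))
```

Matching on the field name "data:" and trimming whitespace afterwards avoids hard-coding one server's exact prefix, so the same loop should work against servers that include the space and those that omit it.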