Update English prompt to 34k in vllm_online_benchmark.py

intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.

Apache License 2.0

Description

1. Why the change?

2. User API changes

3. Summary of the change

4. How to test?

[ ] N/A
[ ] Unit test: Please manually trigger the PR Validation by entering the PR number (e.g., 1234), and paste your action link here once it has finished successfully.
[ ] Application test
[ ] Document test
[ ] ...

5. New dependencies

[ ] New Python dependencies
- Dependency1
- Dependency2
- ...
[ ] New Java/Scala dependencies and their license
- Dependency1 and license1
- Dependency2 and license2
- ...