Closed: Storm0921 closed this issue 5 days ago
Hi, @Storm0921. How did you run the tests, and how did you set up DB-GPT? "LIMIT_MODEL_CONCURRENCY" controls the number of simultaneous requests each Model Worker can handle.
Hello, I use proxyllm: I start a vLLM model myself and then access it through the proxy. I found that limit_model_concurrency cannot control the concurrency in this setup.
@fangyinc Hello, is it correct that the limit_model_concurrency parameter cannot control concurrency when a large model is accessed through a proxy? I would like to confirm this point.
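To illustrate the behavior being discussed, here is a minimal, hypothetical sketch of how a per-worker concurrency cap like LIMIT_MODEL_CONCURRENCY is typically enforced: a semaphore inside the worker process gates how many inference calls run at once. The `Worker` class, method names, and the limit value below are all illustrative assumptions, not DB-GPT's actual implementation. Note that when requests are only forwarded to an external server (e.g. a self-hosted vLLM instance behind proxyllm), the backend's own scheduler decides the real batching, so a cap like this only bounds in-flight proxy requests on the worker side.

```python
import asyncio

LIMIT_MODEL_CONCURRENCY = 5  # assumed value for illustration


class Worker:
    """Hypothetical model worker that caps concurrent generate() calls."""

    def __init__(self, limit: int) -> None:
        self._sem = asyncio.Semaphore(limit)
        self.active = 0  # calls currently inside the semaphore
        self.peak = 0    # highest concurrency observed

    async def generate(self, prompt: str) -> str:
        # Blocks here once `limit` calls are already in flight.
        async with self._sem:
            self.active += 1
            self.peak = max(self.peak, self.active)
            await asyncio.sleep(0.01)  # stand-in for the actual model call
            self.active -= 1
            return f"echo:{prompt}"


async def main() -> int:
    worker = Worker(LIMIT_MODEL_CONCURRENCY)
    # Fire 20 requests at once; the semaphore caps how many overlap.
    await asyncio.gather(*(worker.generate(f"p{i}") for i in range(20)))
    return worker.peak


if __name__ == "__main__":
    print(asyncio.run(main()))
```

In this sketch the observed peak never exceeds the limit, regardless of how many requests arrive, which is the behavior one would expect from the parameter when the model runs inside the worker itself.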
This issue has been marked as stale because it has been over 30 days without any activity.
This issue has been closed because it has been marked as stale and there has been no activity for over 7 days.
Search before asking
Operating system information
Linux
Python version information
DB-GPT version
main
Related scenes
Installation Information
[ ] Installation From Source
[ ] Docker Installation
[ ] Docker Compose Installation
[ ] Cluster Installation
[ ] AutoDL Image
[ ] Other
Device information
A100
Models information
Qwen1.5-72B-chat-Int4
What happened
LIMIT_MODEL_CONCURRENCY
What are the restrictions on this parameter in the env file? The parameter setting does not seem to have any effect: I ran the official vLLM benchmark with it set to 5 and to 10500 respectively, and the results were the same.

What you expected to happen

Could you please explain the meaning of this parameter?
How to reproduce
yes
Additional context
No response
Are you willing to submit PR?