fix: subsequent requests cannot be sent until 'num_concurrent_requests' requests have all finished in non-block mode

ray-project / llmperf

LLMPerf is a library for validating and benchmarking LLMs

Apache License 2.0

Open llsj14 opened 4 days ago

llsj14 commented 4 days ago

Even in non-block mode, subsequent requests cannot be sent until all 'num_concurrent_requests' in-flight requests have finished.
Fixing the request launcher itself was difficult because of its dependency on Ray, so I instead used multiple threads, each with its own request launcher that holds one client and controls only one request.
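
To illustrate the idea, here is a minimal sketch of the thread-per-client pattern using only the Python standard library. It is not the code in this PR and does not use LLMPerf or Ray APIs: `fake_send_request`, `worker`, and `run_requests` are hypothetical names, and the request is simulated with `time.sleep`. Each thread sends its next request as soon as its own previous one completes, instead of waiting for the whole batch.

```python
import threading
import time
from queue import Empty, Queue
from typing import Dict, List


def fake_send_request(request_id: int, latency_s: float) -> Dict:
    """Hypothetical stand-in for one blocking LLM request (not an LLMPerf API)."""
    time.sleep(latency_s)
    return {"id": request_id, "latency_s": latency_s}


def worker(pending: Queue, results: List[Dict], lock: threading.Lock) -> None:
    """One thread acts as one client and handles one request at a time."""
    while True:
        try:
            request_id, latency_s = pending.get_nowait()
        except Empty:
            return  # no requests left, so this thread exits
        result = fake_send_request(request_id, latency_s)
        with lock:
            results.append(result)  # stored in completion order


def run_requests(latencies: List[float], num_concurrent_requests: int) -> List[Dict]:
    """Run all requests with at most num_concurrent_requests in flight."""
    pending: Queue = Queue()
    for request_id, latency_s in enumerate(latencies):
        pending.put((request_id, latency_s))

    results: List[Dict] = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(pending, results, lock))
        for _ in range(num_concurrent_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # One slow request (id 0) and three fast ones, two concurrent clients.
    done = run_requests([0.2, 0.02, 0.02, 0.02], num_concurrent_requests=2)
    print([r["id"] for r in done])
```

With two threads, the slow request occupies one thread while the other works through the three fast requests, so the fast requests are not held back by the slow one.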