The vLLM server consistently crashes while processing lm-eval requests:
```
INFO 10-01 09:52:39 engine.py:288] Added request cmpl-270a6c19d13b4fb6aac151b9c8ba44c2-0.
ERROR 10-01 09:52:48 client.py:244] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-01 09:52:48 client.py:244] NoneType: None
INFO 10-01 09:52:48 metrics.py:351] Avg prompt throughput: 2443.7 tokens/s, Avg generation throughput: 4.3 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
INFO: ::1:39238 - "POST /v1/completions HTTP/1.1" 200 OK
CRITICAL 10-01 09:52:48 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO: ::1:58624 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [2440238]
INFO 10-01 09:52:48 multiproc_worker_utils.py:137] Terminating local vLLM worker processes
(VllmWorkerProcess pid=2440870) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
(VllmWorkerProcess pid=2440872) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
(VllmWorkerProcess pid=2440871) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
/usr/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/usr/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```
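
For context, lm-eval drives the server through the standard OpenAI-compatible `/v1/completions` endpoint (the `POST /v1/completions` calls in the log). Below is a minimal sketch of that request pattern; the base URL, model name, prompt, and concurrency level are assumptions for illustration and are not taken from the actual lm-eval run:

```python
# Hypothetical reproduction sketch of the traffic shape lm-eval sends to the
# vLLM OpenAI-compatible server. Model name, prompts, and concurrency are
# placeholders, not values from the log above.
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def one_request(i: int) -> str:
    # lm-eval issues completion requests with logprobs enabled; this mirrors
    # that shape with a dummy prompt.
    resp = client.completions.create(
        model="my-model",  # assumption: whichever model the server is serving
        prompt=f"Question {i}: 2 + 2 =",
        max_tokens=16,
        logprobs=1,
    )
    return resp.choices[0].text

# Many concurrent requests approximate the load under which the
# MQLLMEngine heartbeat timeout shows up.
with ThreadPoolExecutor(max_workers=32) as pool:
    for text in pool.map(one_request, range(128)):
        print(text)
```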