logikon-ai / cot-eval

A framework for evaluating the effectiveness of chain-of-thought reasoning in language models.
https://huggingface.co/spaces/logikon/open_cot_leaderboard
MIT License

mistralai/Mistral-Nemo-Instruct-2407 vllm-server crashes #62

Open ggbetz opened 3 hours ago

ggbetz commented 3 hours ago

vllm server consistently crashes while processing lm-eval requests:

```
INFO 10-01 09:52:39 engine.py:288] Added request cmpl-270a6c19d13b4fb6aac151b9c8ba44c2-0.
ERROR 10-01 09:52:48 client.py:244] TimeoutError('No heartbeat received from MQLLMEngine')
ERROR 10-01 09:52:48 client.py:244] NoneType: None
INFO 10-01 09:52:48 metrics.py:351] Avg prompt throughput: 2443.7 tokens/s, Avg generation throughput: 4.3 tokens/s, Running: 0 reqs, Swapped: 0 reqs, Pending: 0 reqs, GPU KV cache usage: 0.0%, CPU KV cache usage: 0.0%.
INFO:     ::1:39238 - "POST /v1/completions HTTP/1.1" 200 OK
CRITICAL 10-01 09:52:48 launcher.py:99] MQLLMEngine is already dead, terminating server process
INFO:     ::1:58624 - "POST /v1/completions HTTP/1.1" 500 Internal Server Error
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [2440238]
INFO 10-01 09:52:48 multiproc_worker_utils.py:137] Terminating local vLLM worker processes
(VllmWorkerProcess pid=2440870) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
(VllmWorkerProcess pid=2440872) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
(VllmWorkerProcess pid=2440871) INFO 10-01 09:52:48 multiproc_worker_utils.py:244] Worker exiting
/usr/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/usr/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
```

ggbetz commented 3 hours ago

Related? https://github.com/vllm-project/vllm/issues/7532