ml-explore / mlx-examples

Examples in the MLX framework
MIT License

Web server crashing randomly #1066

Closed by amirvenus 1 month ago

amirvenus commented 1 month ago

Hi,

When I send a POST request to the LLM server, especially with the model mlx-community/Qwen2.5-32B-Instruct-8bit, the server crashes with the following error:


127.0.0.1 - - [23/Oct/2024 03:15:06] "POST /v1/chat/completions HTTP/1.1" 200 -
2024-10-23 03:15:06,650 - DEBUG - Starting completion:
----------------------------------------
Exception occurred during processing of request from ('127.0.0.1', 61900)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 349, in process_request
    self.finish_request(request, client_address)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 362, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/server.py", line 733, in <lambda>
    lambda *args, **kwargs: handler_class(
                            ^^^^^^^^^^^^^^
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/server.py", line 200, in __init__
    super().__init__(*args, **kwargs)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 761, in __init__
    self.handle()
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/http/server.py", line 436, in handle
    self.handle_one_request()
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/http/server.py", line 424, in handle_one_request
    method()
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/server.py", line 296, in do_POST
    method(prompt, stop_id_sequences)
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/server.py", line 504, in handle_completion
    detokenizer.finalize()
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/tokenizer_utils.py", line 219, in finalize
    self.text += self._maybe_trim_space(current_text)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/si/development/mlx/.venv/lib/python3.12/site-packages/mlx_lm/tokenizer_utils.py", line 196, in _maybe_trim_space
    if current_text[0] != " ":
       ~~~~~~~~~~~~^^^
IndexError: string index out of range
----------------------------------------

awni commented 1 month ago

Thanks for flagging. This is fixed on main already and pending release.
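
For context on the traceback above: the failing line is `if current_text[0] != " ":`, which raises `IndexError` whenever `current_text` is an empty string. The sketch below illustrates the general kind of guard that avoids this. It is not the actual patch on main; `trim_leading_space` is a hypothetical standalone function, and its trimming behavior is an assumption based only on the one line visible in the traceback.

```python
def trim_leading_space(current_text: str) -> str:
    """Hypothetical helper: drop one leading space, tolerating empty input.

    Only the `current_text[0] != " "` check comes from the traceback;
    the rest is an illustrative assumption, not mlx_lm's implementation.
    """
    # Guard: indexing an empty string with [0] raises IndexError,
    # which is the failure shown in the traceback.
    if not current_text:
        return current_text
    if current_text[0] != " ":
        return current_text
    return current_text[1:]


if __name__ == "__main__":
    for sample in ["", " hello", "hello"]:
        print(repr(sample), "->", repr(trim_leading_space(sample)))
```

An equivalent guard is `current_text.startswith(" ")`, which is safe on empty strings because it never indexes into the text.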