I ran into this issue when testing a locally generated model based on Qwen:
----------------------------------------
Exception occurred during processing of request from ('[IP REDACTED]', 32822)
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 349, in process_request
    self.finish_request(request, client_address)
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 362, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/server.py", line 758, in <lambda>
    lambda *args, **kwargs: handler_class(
                            ^^^^^^^^^^^^^^
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/server.py", line 200, in __init__
    super().__init__(*args, **kwargs)
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/socketserver.py", line 761, in __init__
    self.handle()
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/http/server.py", line 436, in handle
    self.handle_one_request()
  File "/opt/homebrew/Cellar/python@3.12/3.12.7_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/http/server.py", line 424, in handle_one_request
    method()
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/server.py", line 313, in do_POST
    method(prompt, stop_id_sequences)
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/server.py", line 594, in handle_stream
    detokenizer.add_token(token)
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/tokenizer_utils.py", line 210, in add_token
    self.text += self._maybe_trim_space(current_text)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/isaac/venv/lib/python3.12/site-packages/mlx_lm/tokenizer_utils.py", line 196, in _maybe_trim_space
    if current_text[0] != " ":
       ~~~~~~~~~~~~^^^
IndexError: string index out of range
----------------------------------------
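This covers a gap in the logic of _maybe_trim_space: when current_text is an empty string, indexing current_text[0] raises the IndexError shown above. Handling the empty case should help future-proof the function's use.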
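To illustrate the failure mode in isolation, here is a minimal sketch. The two helper functions are hypothetical stand-ins written for this description, not the actual mlx_lm code; they only reproduce the indexing check from the traceback and one possible way to guard it.

```python
def starts_with_space_unsafe(current_text: str) -> bool:
    # Same kind of check as the failing line in the traceback:
    # indexing position 0 raises IndexError on an empty string.
    return current_text[0] == " "


def starts_with_space(current_text: str) -> bool:
    # Guarded variant (an assumption about the shape of the fix):
    # an empty string short-circuits before any indexing happens.
    return bool(current_text) and current_text[0] == " "


if __name__ == "__main__":
    try:
        starts_with_space_unsafe("")
    except IndexError as exc:
        print(f"unsafe check failed: {exc}")

    # The guarded check returns False for empty input instead of raising.
    print(starts_with_space(""))
    print(starts_with_space(" hello"))
```

An equivalent guard would be str.startswith(" "), which also returns False on an empty string without indexing.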