lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Bug in streaming mode -> UnboundLocalError: local variable 'stopped' referenced before assignment #2600

Closed npuichigo closed 10 months ago

npuichigo commented 11 months ago

I think `UnboundLocalError: local variable 'stopped' referenced before assignment` is a bug in the code:

ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr | ERROR:    Exception in ASGI application
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr | Traceback (most recent call last):
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     return await self.app(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/fastapi/applications.py", line 1115, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await super().__call__(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/applications.py", line 122, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 184, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     raise exc
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/errors.py", line 162, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await self.app(scope, receive, _send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 79, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     raise exc
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/middleware/exceptions.py", line 68, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await self.app(scope, receive, sender)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     raise e
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await self.app(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 718, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await route.handle(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 276, in handle
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await self.app(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 69, in app
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await response(scope, receive, send)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 270, in __call__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     async with anyio.create_task_group() as task_group:
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     raise exceptions[0]
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 273, in wrap
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     await func()
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 262, in stream_response
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     async for chunk in self.body_iterator:
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/concurrency.py", line 63, in iterate_in_threadpool
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     yield await anyio.to_thread.run_sync(_next, iterator)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     return await get_asynclib().run_sync_in_worker_thread(
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     return await future
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     result = context.run(func, *args)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/starlette/concurrency.py", line 53, in _next
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     return next(iterator)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/fastchat/serve/model_worker.py", line 104, in generate_stream_gate
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     for output in self.generate_stream_func(
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 35, in generator_context
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     response = gen.send(None)
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/dist-packages/fastchat/serve/inference.py", line 249, in generate_stream
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr |     if stopped:
ibot-model-worker-1  | 2023-10-25 02:02:09 | ERROR | stderr | UnboundLocalError: local variable 'stopped' referenced before assignment
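The traceback ends at a classic Python pitfall: a variable assigned only inside a loop body stays unbound if the loop runs zero iterations. The following is a minimal hypothetical sketch (not FastChat's actual code) of how a streaming generator can hit this when the prompt exhausts the context window:

```python
def stream_tokens(prompt_tokens, context_len, max_new_tokens):
    # Hypothetical sketch of the failure mode, not FastChat's actual code.
    # `stopped` is assigned only inside the loop body, so if the prompt
    # already fills the context window the loop runs zero times and the
    # post-loop check raises UnboundLocalError.
    budget = min(max_new_tokens, context_len - len(prompt_tokens))
    for i in range(budget):
        stopped = i + 1 >= budget
        yield f"token_{i}"
    if stopped:  # UnboundLocalError when budget <= 0
        yield "[DONE]"

# Works while the prompt fits in the context window:
print(list(stream_tokens([0] * 10, context_len=16, max_new_tokens=4)))
# → ['token_0', 'token_1', 'token_2', 'token_3', '[DONE]']

# Fails like the traceback above once the prompt is too long:
try:
    list(stream_tokens([0] * 20, context_len=16, max_new_tokens=4))
except UnboundLocalError as exc:
    print(exc)
```

This matches the symptom reported below: the real problem is a too-long input, but what surfaces is an unrelated-looking Python error.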
bsabri commented 11 months ago

We have the same issue. It seems to be related to exceeding the maximum context length.

infwinston commented 11 months ago

Could you provide steps to reproduce this issue? That would be helpful for debugging.

npuichigo commented 11 months ago

We have the same issue. It seems to be related to exceeding the maximum context length.

This error occurs when I use LangChain + VectorStore to provide a large amount of retrieved context for LLM generation. So it does seem the LLM context is too long, but the error is not obvious and just surfaces as a Python `UnboundLocalError`.

npuichigo commented 11 months ago

@merrymercy

zchuz commented 11 months ago

Same issue. I encounter this with Llama-2-chat-7B. The input length plus max_new_tokens is around 2500 tokens, less than the 4096 limit. I was able to run it successfully with Vicuna-1.5-7B, even though both are based on Llama 2.

zchuz commented 11 months ago

I may have found the cause of the error: max_position_embeddings in config.json. The config downloaded from the LLAMA2 repository has max_position_embeddings = 2048 (which is the Llama 1 value). When I change it to 4096, it works. @npuichigo
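For reference, the change described above amounts to editing one field in the checkpoint's config.json. A small sketch (using a temporary directory and writing a sample Llama-1-style config only to demonstrate the edit; the path is hypothetical):

```python
import json
import os
import tempfile

# Hypothetical checkpoint directory; we create a sample config.json here
# only so the edit below has something to operate on.
ckpt_dir = tempfile.mkdtemp()
cfg_path = os.path.join(ckpt_dir, "config.json")
with open(cfg_path, "w") as f:
    json.dump({"model_type": "llama", "max_position_embeddings": 2048}, f)

# The fix described above: raise max_position_embeddings from the
# Llama 1 value (2048) to the Llama 2 context window (4096).
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["max_position_embeddings"] = 4096
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```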

npuichigo commented 11 months ago

@zchuz Thanks for the info. I also agree it's caused by the model's maximum length restriction. But FastChat should return a clearer error response; as it stands, the length overflow surfaces as a Python bug, `UnboundLocalError: local variable 'stopped' referenced before assignment`. @merrymercy
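A defensive fix along the lines requested here (a sketch, not the actual FastChat patch) would bind `stopped` before the loop and return an explicit error chunk when the prompt leaves no generation budget:

```python
def stream_tokens_safely(prompt_tokens, context_len, max_new_tokens):
    # Hypothetical sketch, not the actual FastChat patch: report the
    # length overflow explicitly instead of tripping over an unbound
    # variable, and bind `stopped` before the loop as a safety net.
    budget = min(max_new_tokens, context_len - len(prompt_tokens))
    if budget <= 0:
        yield {"text": "", "error_code": 1,
               "message": f"prompt fills the {context_len}-token context window"}
        return
    stopped = False  # bound up front, so the post-loop check is always safe
    for i in range(budget):
        stopped = i + 1 >= budget
        yield {"text": f"token_{i}", "error_code": 0}
    if stopped:
        yield {"text": "[DONE]", "error_code": 0}
```

With this shape, an oversized prompt produces a single error chunk the caller can show to the user, instead of an exception inside the ASGI stream.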

chanchimin commented 10 months ago

I encountered the same issue with fastchat-t5-3b, but I can't find where to change max_position_embeddings in its config.