OpenBMB / XAgent

An Autonomous LLM Agent for Complex Task Solving
https://blog.x-agent.net/blog/xagent/
Apache License 2.0

XAgentGen: vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. #311

Closed: YinSonglin1997 closed this issue 6 months ago

YinSonglin1997 commented 7 months ago

Issue Description

Following the "Using XAgentGen" instructions, I ran python run.py. Partway through the run it fails with the error: chatcompletion error: Expecting value: line 1 column 1 (char 0)

Steps to Reproduce

https://github.com/OpenBMB/XAgent/blob/main/XAgentGen/README.md

Expected Behavior

The run finishes normally and returns its result.

Environment

Error Screenshots or Logs

Terminal error from python run.py: 1701821931429

Terminal error from docker run xagentteam/xagentgen:latest:

WARNING 12-05 10:08:06 scheduler.py:147] Input prompt (17751 tokens) is too long and exceeds limit of 16384
Exception in callback functools.partial(<function _raise_exception_on_finish at 0x7ff3cbe405e0>, request_tracker=<vllm.engine.async_llm_engine.RequestTracker object at 0x7ff3bdbb28c0>)
handle: <Handle functools.partial(<function _raise_exception_on_finish at 0x7ff3cbe405e0>, request_tracker=<vllm.engine.async_llm_engine.RequestTracker object at 0x7ff3bdbb28c0>)>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 28, in _raise_exception_on_finish
    task.result()
  File "/opt/conda/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/opt/conda/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 350, in run_engine_loop
    has_requests_in_progress = await self.engine_step()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 329, in engine_step
    request_outputs = await self.engine.step_async()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 186, in step_async
    seq_group_metadata_list, scheduler_outputs, ignored = self._schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 303, in _schedule
    seq_group_metadata_list, scheduler_outputs = self.scheduler.schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/core/scheduler.py", line 273, in schedule
    scheduler_outputs = self._schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/core/scheduler.py", line 189, in _schedule
    num_batched_tokens=len(seq_lens) * max(seq_lens),
ValueError: max() arg is an empty sequence

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 37, in _raise_exception_on_finish
    raise exc
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 32, in _raise_exception_on_finish
    raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
INFO:     172.21.0.1:48008 - "POST /chat/completions HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 28, in _raise_exception_on_finish
    task.result()
  File "/opt/conda/lib/python3.10/asyncio/futures.py", line 201, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/opt/conda/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 350, in run_engine_loop
    has_requests_in_progress = await self.engine_step()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 329, in engine_step
    request_outputs = await self.engine.step_async()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 186, in step_async
    seq_group_metadata_list, scheduler_outputs, ignored = self._schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 303, in _schedule
    seq_group_metadata_list, scheduler_outputs = self.scheduler.schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/core/scheduler.py", line 273, in schedule
    scheduler_outputs = self._schedule()
  File "/opt/conda/lib/python3.10/site-packages/vllm/core/scheduler.py", line 189, in _schedule
    num_batched_tokens=len(seq_lens) * max(seq_lens),
ValueError: max() arg is an empty sequence

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/opt/conda/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/fastapi/applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/opt/conda/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/opt/conda/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/opt/conda/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "/opt/conda/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "/app/app.py", line 125, in chat_function
    async for request_output in results_generator:
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 435, in generate
    raise e
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 429, in generate
    async for request_output in stream:
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 70, in __anext__
    raise result
  File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 37, in _raise_exception_on_finish
    raise exc
  File "/opt/conda/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 32, in _raise_exception_on_finish
    raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
AL-377 commented 7 months ago

Thank you for the report. Would it be convenient to send the relevant run log files to my email, ljt20233023@163.com, so that we can investigate further?

YinSonglin1997 commented 7 months ago

> Thank you for the report. Would it be convenient to send the relevant run log files to my email, ljt20233023@163.com, so that we can investigate further?

I have sent them to you.

AL-377 commented 7 months ago

> Thank you for the report. Would it be convenient to send the relevant run log files to my email, ljt20233023@163.com, so that we can investigate further?
>
> I have sent them to you.

Thanks, received 🫡

Cppowboy commented 6 months ago

Judging from the error message, your input seems to exceed the 16k length limit. You could try adjusting the length-related options in the configuration file.

AL-377 commented 6 months ago

Hi, the first line of your docker run xagentteam/xagentgen:latest output, "WARNING 12-05 10:08:06 scheduler.py:147] Input prompt (17751 tokens) is too long and exceeds limit of 16384", indicates that the failure is caused by the input exceeding the length limit. You could try increasing the max_tokens field in the configuration file; see https://github.com/OpenBMB/XAgent/blob/main/assets/xagentllama.yml
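For reference, a quick way to confirm which value is being picked up before rerunning (a minimal sketch that assumes assets/xagentllama.yml is plain YAML with a top-level max_tokens key; the file's exact layout is not shown in this thread):

```python
# Hypothetical check: assumes a flat YAML config with a top-level max_tokens key.
import yaml

with open("assets/xagentllama.yml") as f:
    cfg = yaml.safe_load(f)

print("max_tokens =", cfg.get("max_tokens"))  # should show the raised limit after editing
```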

YinSonglin1997 commented 6 months ago

> Hi, the first line of your docker run xagentteam/xagentgen:latest output, "WARNING 12-05 10:08:06 scheduler.py:147] Input prompt (17751 tokens) is too long and exceeds limit of 16384", indicates that the failure is caused by the input exceeding the length limit. You could try increasing the max_tokens field in the configuration file; see https://github.com/OpenBMB/XAgent/blob/main/assets/xagentllama.yml

OK. When you say to increase the max_tokens field in the configuration file, can it be increased arbitrarily, or are there constraints?

AL-377 commented 6 months ago

Hi, there is currently no restriction; we suggest increasing it to 32768.

YinSonglin1997 commented 6 months ago

> Hi, there is currently no restriction; we suggest increasing it to 32768.

Hi, I changed max_tokens in assets/xagentllama.yml to 32768, but at runtime I still get the error "Input prompt (16479 tokens) is too long and exceeds limit of 16384". Since I have already changed the parameter, what could be the reason?

AL-377 commented 6 months ago

> Hi, I changed max_tokens in assets/xagentllama.yml to 32768, but at runtime I still get the error "Input prompt (16479 tokens) is too long and exceeds limit of 16384". Since I have already changed the parameter, what could be the reason?

Sorry, my earlier reply was inaccurate. The base model behind xagentllama is CodeLlama, whose maximum supported context length is 16384 tokens, so longer contexts are not supported in principle. If you want to force a larger context length, you need to make the following changes (see the sketch after this comment for step 3):

  1. Set max_tokens in assets/xagentllama.yml to 32768.
  2. Set max_position_embeddings to 32768 in config.json inside the xagentllama model folder.
  3. Change lines 55-56 of app.py so that both max_num_batched_tokens and max_model_len are 32768.

Note: since CodeLlama itself does not support contexts longer than 16384 tokens, the changes above may degrade model performance. We recommend waiting for the team to release other models, or modifying the model inference method to support longer contexts.
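For reference, a minimal sketch of what the step 3 change could look like, assuming app.py builds its engine from vLLM's AsyncEngineArgs (the model path and variable names below are illustrative, not the actual contents of app.py):

```python
# Hypothetical sketch of the app.py change in step 3 (not the actual XAgentGen code).
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

engine_args = AsyncEngineArgs(
    model="/path/to/xagentllama",   # model folder whose config.json was patched in step 2
    max_num_batched_tokens=32768,   # scheduler token budget, raised from 16384
    max_model_len=32768,            # per-request context limit, raised from 16384
)
engine = AsyncLLMEngine.from_engine_args(engine_args)
```

vLLM validates max_model_len against the model's reported maximum length, which is why step 2 raises max_position_embeddings in config.json as well.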