run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License
36.09k stars 5.14k forks

[Bug]: My inner workflow faces a timeout error #16549

Open tituslhy opened 4 hours ago

tituslhy commented 4 hours ago

Bug Description

Despite specifying a generous timeout limit for all my workflows, the inner workflow of my nested workflow always times out after 10 seconds once deployed on LlamaDeploy.

  1. This error did not occur in Jupyter notebooks
  2. When I ping the inner workflow directly using LlamaDeployClient, I get no timeout error

This means the timeout error only happens when I ping the outer workflow, which then routes to the inner workflow.
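The behavior above can be sketched in plain Python. This is an assumption about the failure mode, not LlamaDeploy internals: if the routing layer re-creates the workflow with defaults instead of using the configured instance, the configured timeout is lost only on the nested path. `Service`, `call_direct`, and `call_via_outer` are hypothetical stand-ins.

```python
# Pure-Python sketch of the suspected failure mode; nothing here is a real
# LlamaDeploy class.
DEFAULT_TIMEOUT = 10.0

class Service:
    def __init__(self, timeout: float = DEFAULT_TIMEOUT):
        self.timeout = timeout

def call_direct(svc: Service) -> float:
    # Direct ping: the configured instance (and its timeout) is used.
    return svc.timeout

def call_via_outer(svc: Service) -> float:
    # Nested route: the layer ignores the configured instance and
    # re-instantiates with defaults, so the 10 s limit applies.
    return Service().timeout

configured = Service(timeout=1200.0)
print(call_direct(configured))     # 1200.0
print(call_via_outer(configured))  # 10.0
```

This matches the reported symptoms: the direct ping honors the 1200 s limit, while the routed call hits the 10 s default.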

Version

0.11.17

Steps to Reproduce

Deployment of core

from llama_deploy import (
    deploy_core,
    ControlPlaneConfig,
)
from llama_deploy.message_queues.simple import (
    SimpleMessageQueueConfig
)
control_plane_config = ControlPlaneConfig()
message_queue_config = SimpleMessageQueueConfig()

if __name__ == "__main__":
    import asyncio

    asyncio.run(
        deploy_core(
            control_plane_config=control_plane_config,
            message_queue_config=message_queue_config,
            disable_control_plane=False,
            disable_message_queue=False,
        )
    )

Deployment of workflow

from llama_index.core.workflow import Workflow, StartEvent, StopEvent, step
import asyncio
from llama_deploy import (
    WorkflowServiceConfig,
    ControlPlaneConfig,
    deploy_workflow,
)

class InnerWorkflow(Workflow):
    @step()
    async def run_step(self, ev: StartEvent) -> StopEvent:
        result = ...  # some long step here
        return StopEvent(result=str(result))

class OuterWorkflow(Workflow):
    @step()
    async def run_step(
        self, ev: StartEvent, inner: InnerWorkflow
    ) -> StopEvent:
        arg1 = ev.get("arg1")
        result = await inner.run(arg1=arg1)

        return StopEvent(result=str(result))

inner = InnerWorkflow(timeout=1200.0, verbose=True)
outer = OuterWorkflow(timeout=6000.0, verbose=True)
outer.add_workflows(inner=inner)

async def main():
    inner_task = asyncio.create_task(
        deploy_workflow(
            inner,
            WorkflowServiceConfig(
                host="127.0.0.1", port=8003, service_name="inner"
            ),
            ControlPlaneConfig(),
        )
    )

    outer_task = asyncio.create_task(
        deploy_workflow(
            outer,
            WorkflowServiceConfig(
                host="127.0.0.1", port=8002, service_name="outer"
            ),
            ControlPlaneConfig(),
        )
    )

    await asyncio.gather(inner_task, outer_task)

if __name__ == "__main__":
    asyncio.run(main())

Calling of workflow

from llama_deploy import AsyncLlamaDeployClient, ControlPlaneConfig

control_plane_config = ControlPlaneConfig()
client = AsyncLlamaDeployClient(control_plane_config)
session = await client.create_session()
result = await session.run(
    "outer",
    arg1=...,  # OuterWorkflow reads this via ev.get("arg1")
)

Relevant Logs/Tracebacks

Traceback (most recent call last):
  File "/app/tlim2/anaconda3/envs/llamaindex/lib/python3.12/site-packages/llama_index/core/instrumentation/dispatcher.py", line 281, in handle_future_result
    result = future.result()
             ^^^^^^^^^^^^^^^
  File "/app/tlim2/anaconda3/envs/llamaindex/lib/python3.12/site-packages/llama_index/core/workflow/workflow.py", line 380, in _run_workflow
    raise WorkflowTimeoutError(msg)
llama_index.core.workflow.errors.WorkflowTimeoutError: Operation timed out after 10.0 seconds
dosubot[bot] commented 4 hours ago

The issue you're experiencing with the WorkflowTimeoutError in your nested workflow setup on LlamaDeploy seems to be related to the default timeout setting. The error message indicates that the timeout is occurring after 10.0 seconds, which is the default timeout for workflows in LlamaDeploy [1].

Here are a few things to check and try:

  1. Ensure Timeout Override: Make sure that the timeout settings you specified for both the InnerWorkflow and OuterWorkflow are correctly applied. It seems like the default timeout of 10.0 seconds is being used instead of your specified timeouts. Double-check that the timeout parameter is being correctly passed and applied when you instantiate your workflows.

  2. Check add_workflows Method: When you add the InnerWorkflow to the OuterWorkflow using outer.add_workflows(inner=InnerWorkflow()), ensure that the instance of InnerWorkflow being added has the correct timeout setting. You might want to pass the already configured inner instance instead of creating a new one with InnerWorkflow().

  3. Deployment Configuration: Verify that the deployment configuration for both workflows is correct and that there are no issues with how the workflows are being deployed or how the AsyncLlamaDeployClient is interacting with them.

  4. Debugging: Add logging or print statements to confirm that the timeout values are being set as expected when the workflows are initialized and deployed.

By ensuring that the correct timeout values are being applied and that the InnerWorkflow is properly configured when added to the OuterWorkflow, you should be able to resolve the timeout issue.
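As a minimal, library-free illustration of point 1 (an asyncio sketch, not LlamaDeploy code): when any layer in a nested call chain applies its own default timeout, a generous outer limit cannot prevent the inner one from firing first. The timeouts here are scaled down so the sketch runs quickly.

```python
import asyncio

async def inner_step(duration: float) -> str:
    await asyncio.sleep(duration)  # stand-in for "some long step"
    return "done"

async def run_with_default(coro, timeout: float = 0.01):
    # Stand-in for a layer that silently applies its own default timeout.
    return await asyncio.wait_for(coro, timeout=timeout)

async def outer(duration: float) -> str:
    # The outer limit is generous, but the inner default still fires first.
    return await asyncio.wait_for(
        run_with_default(inner_step(duration)), timeout=5.0
    )

try:
    outcome = asyncio.run(outer(0.05))
except asyncio.TimeoutError:
    outcome = "timed out by the inner default"
print(outcome)
```

The fix is therefore to make sure the configured timeout actually reaches every layer that wraps the inner call, not just the outermost one.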

To continue talking to Dosu, mention @dosu.

tituslhy commented 4 hours ago

@dosu Sorry, I made a typo. This is how I added the inner workflow:

inner = InnerWorkflow(timeout=1200.0, verbose=True)
outer = OuterWorkflow(timeout=6000.0, verbose=True)
outer.add_workflows(inner=inner)
tituslhy commented 4 hours ago

I edited LlamaIndex's workflow.py on my machine and changed the default timeout. It works now. This means that, for some reason, the nested workflow still uses the default timeout instead of registering the timeout I provided.
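A less invasive workaround than editing the installed workflow.py might be to pin the timeout in a subclass constructor, so that even a re-instantiation with no arguments gets the intended value. A minimal sketch, where `BaseWorkflow` is a hypothetical stand-in for the real llama_index Workflow (only the timeout keyword matters here):

```python
class BaseWorkflow:
    # Hypothetical stand-in for llama_index's Workflow class.
    def __init__(self, timeout: float = 10.0, **kwargs):
        self.timeout = timeout

class PinnedInnerWorkflow(BaseWorkflow):
    def __init__(self, **kwargs):
        # Force the intended timeout even if a caller instantiates
        # this class with defaults.
        kwargs.setdefault("timeout", 1200.0)
        super().__init__(**kwargs)

print(PinnedInnerWorkflow().timeout)            # 1200.0
print(PinnedInnerWorkflow(timeout=60).timeout)  # 60 (explicit value still wins)
```

With this pattern the timeout survives regardless of where the instance is created, without patching library source.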

dosubot[bot] commented 4 hours ago

Hey @tituslhy! It looks like you're sharing some code details, but I'm just a bot here to help with bugs, technical questions, and engineering knowledge. If you think I should have more to say on this, feel free to reach out to the Dosu Team. They'll know what to do! 😊

To continue talking to Dosu, mention @dosu.