dexhorthy opened this issue 2 months ago
Hey @dexhorthy.
A resounding :hell_yeah: from us over here. You've nailed the two blessed ways at the moment: pause/resume + make a tool. The former is nice because it lets you sleep agents until they're ready, but the latter is more ergonomic IMO and lets you lean into writing cleaner prompts.
Would love to find a way to make these two easier, where you can sleep an agent (sleeper agent!??!?) workflow until its tool result comes back in.
One additional bit of detail here as we're thinking through this: pause/suspend might work okay if you get a response in under an hour (after which the timeout hits), but the natural way I had implemented this was:

1. flow -> AI task -> tool `get_confirmation_in_slack` -> slack -> set `Variable` mapping `slack_msg_id` to `flow_run_id`, then pause flow with `wait_for_input`
2. slack webhook -> fastapi -> look up `flow_run_id` with the inbound `slack_msg_id` of the slack thread parent, `send_input` to the paused flow
The issue there is that you end up with an error, because you can't pause a flow if there's a `TaskRunContext`:

```
2024-07-24 13:08:12,370 - prefect.task_runs - ERROR - Finished in state Failed('Task run encountered an exception RuntimeError: Cannot pause task runs.')
```
I am currently trying to rework this a bit:

1. flow -> AI task -> slack, set `msg_id`; flow -> pause
2. slack webhook -> fastapi -> look up `flow_run_id` as before, resume the paused flow
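The control flow of that rework (a flow blocked at a pause, woken by the webhook) can be simulated without a Prefect server. This toy sketch uses a `threading.Event` keyed by `flow_run_id` in place of Prefect's actual pause/resume machinery; all names here are illustrative.

```python
import threading

# Toy stand-in for flow-level pause/resume: the webhook thread "resumes"
# a flow that is blocked waiting for a reply. In the real setup this is
# Prefect's pause/resume; a threading.Event per flow_run_id illustrates
# the handshake.

_paused: dict[str, threading.Event] = {}
_replies: dict[str, str] = {}

def pause_until_reply(flow_run_id: str, timeout: float = 5.0) -> str:
    """Flow side: block until the webhook delivers a reply (or time out,
    analogous to the one-hour pause timeout mentioned above)."""
    event = _paused.setdefault(flow_run_id, threading.Event())
    if not event.wait(timeout):
        raise TimeoutError("no reply before the pause timeout")
    return _replies.pop(flow_run_id)

def resume_with_reply(flow_run_id: str, reply: str) -> None:
    """Webhook side: store the reply and wake the paused flow."""
    _replies[flow_run_id] = reply
    _paused.setdefault(flow_run_id, threading.Event()).set()
```

Note the `setdefault` on both sides: it makes the handshake safe even if the webhook reply lands before the flow actually reaches its pause point.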
But the magic of "use the LLM to do the reasoning" is a little lost in this case. I can't really explain why I want this; maybe the "tool method" is, as you said, just "more ergonomic". It eliminates a lot of cognitive overhead in the implementation to tell an LLM "here's a tool you can use to ask for confirmation/input from a backend user", where the tool itself returns the response from the user, and the LLM can even make decisions about when to ask for approval vs. skip that step based on completeness of context, etc.
Among other things, doing the pause/resume at the flow level, outside the LLM, bubbles implementation details up from inside the tool all the way to the flow. Before, one or more "confirmation" implementations could be neatly bundled up in a modular tool and glued arbitrarily to the webhook receiver endpoint (e.g. before exploring flow pause/resume, we did this with an in-memory Queue that the flow, tasks, and LLM knew nothing about).
Enhancement Description
I'm looking to understand the best way to bring the user into a conversation where the interaction is not a CLI / CLI tool. I can think of a few workaround-ish ideas that might work:

- a `tell_user` or `ask_user_for_clarification` tool that handles the IO via a websocket or something
- `user_input=True` on a task, capture/forward stdin/stdout 😬

Use Case
Building autonomous agents for data engineering and data product management. The interaction paradigms I'd like to be able to support include web-app chat via websockets or another async layer, in addition to more "outer loop" channels like email, slack, sms, etc. For example, an agent might discover something and want to alert a user, or might complete a long-running task for a user and want to get the user's input.
These sorts of workflows fit nicely into the D(A)G-y sort of state machine that Prefect enables, but I'm trying to wrap my head around the best way to fit together these kinds of async, multi-player workflows, or even just find some workarounds / patterns that have worked well for applications that have access to LLMs.
Proposed Implementation