microsoft / TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.
https://microsoft.github.io/TaskWeaver/
MIT License
5.35k stars, 680 forks

Tools being called from ext_roles #440

Open sundar7D0 opened 1 week ago

sundar7D0 commented 1 week ago

Can tools (OpenAI-style tool calling) be configured and called from each role? The code interpreter can use code, but giving new ext_roles access to tools would be very useful. Also, if other roles use the same CodeExecutor as CodeInterpreter, will they share the same Jupyter kernel? That would be a useful feature.

@liqul eager to hear your thoughts

liqul commented 4 days ago

Sorry for the late response.

I would like to understand more deeply why you think giving other roles access to tools would be 'very' useful. I have this blog to explain why we introduced the concept of roles as complementary to the Code Interpreter, which is mostly from the perspective of efficiency. In other words, having only the Code Interpreter + Plugins is sufficient for most of our target scenarios if efficiency is not a critical problem.

So, could you explain your target scenario a bit?

sundar7D0 commented 4 days ago

Yes, I have read the blog on roles. Even so, giving each role its own tools would let each role act as an independent agent with a ReAct loop: calling tools, executing them, and so on. The code interpreter makes sense, but going back to the planner and then to the code interpreter every time, even for simple functions, often makes the system unnecessarily slow and unreliable.
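To make the idea of a role-local ReAct loop concrete, here is a minimal sketch. It does not use the TaskWeaver API: `TOOLS`, `fake_llm`, and `react_role` are hypothetical names, and the model is replaced by a stub that requests one tool call and then finishes.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: plain Python callables keyed by tool name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
}

OBS_PREFIX = "observation: "


def fake_llm(history: List[str]) -> Tuple[str, str]:
    """Stand-in for a tool-calling model; returns (action, argument).

    Assumption: a real role would send `history` to an LLM and parse
    the tool call out of its response.
    """
    if not any(line.startswith(OBS_PREFIX) for line in history):
        # No observation yet: ask for a tool call.
        return "search", "TaskWeaver roles"
    # An observation exists: finish with its content as the answer.
    return "finish", history[-1][len(OBS_PREFIX):]


def react_role(task: str, max_steps: int = 5) -> str:
    """Run reason -> act -> observe inside a single role."""
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action, arg = fake_llm(history)
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)  # execute the requested tool
        history.append(OBS_PREFIX + observation)
    raise RuntimeError("step budget exhausted before the role finished")


if __name__ == "__main__":
    print(react_role("find documentation on roles"))
```

The point of the sketch is only that the loop never leaves the role: the tool result is fed back to the same model until it decides to finish.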

liqul commented 4 days ago

I think, based on what you described, that you are asking for a code interpreter that can run the ReAct loop by itself, without relying on the communication between the code interpreter and the planner.

I view this as a different design choice. Basically, in the current design, we have two roles interacting with each other to achieve the ReAct loop. Our idea was to decouple planning and execution into two roles that can each focus on their own job. Though it is possible to merge them into one, which would save tokens and time, the prompt becomes more complicated, and we are unsure whether this would be a good choice for complex tasks that have many steps. Now, the planner and the code generator each have their own prompt, which is easier to maintain.

Therefore, to me, the key is not whether the ext roles can call tools, but whether a role can complete a full ReAct cycle.

sundar7D0 commented 4 days ago

That's true. The Planner <-> Code Interpreter loop makes sense. But simple tool calls, such as calling a search tool to get results or choosing one source over another, can be handled through tool calling inside the role, instead of going back to the planner and then to the code interpreter, which increases the prompt count.
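A rough way to see the prompt-count difference is a back-of-the-envelope model. The numbers below are assumptions for illustration, not measurements of TaskWeaver: each tool call routed through the planner is assumed to cost one planner prompt plus one code-generator prompt, with one final planner prompt to report back, while a role-local loop is assumed to cost one prompt per tool call plus one final prompt. Both function names are hypothetical.

```python
def prompts_via_planner(tool_calls: int) -> int:
    """Assumed cost: planner prompt + code-generator prompt per tool call,
    plus one final planner prompt to summarize the result."""
    return 2 * tool_calls + 1


def prompts_in_role(tool_calls: int) -> int:
    """Assumed cost: one prompt per tool call inside the role,
    plus one final prompt to produce the answer."""
    return tool_calls + 1


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, prompts_via_planner(n), prompts_in_role(n))
```

Under these assumptions the gap grows by one prompt per additional tool call, which is the latency concern raised here.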

liqul commented 1 day ago

Agreed. Due to limited resources, we had to make hard decisions on balancing different parts. We keep reliable task execution top of mind, rather than reducing the prompt count (and the latency). We expect that cost and latency will be significantly reduced in the future.

We provide some solutions, such as the CodeInterpreter-only mode (though without ReAct capability) and letting different roles use different models. These aim to reduce the latency a little.