microsoft / TaskWeaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.
https://microsoft.github.io/TaskWeaver/
MIT License
5.19k stars 661 forks

I have created an application with 4 plugins and Planner and CodeInterpreter, There are few issues i am facing here #144

Closed NisaarAgharia closed 5 months ago

NisaarAgharia commented 8 months ago
  1. Code-Heavy Operations in Planner and CodeInterpreter. Problem: The current handling of tasks like "create a summary of Information X in a particular format" relies too heavily on code. The system writes code to produce the desired format, even though the underlying LLM (Large Language Model) can generate formatted text directly. In addition, an external library is used for the summarization itself.

Seeking Suggestions: Is there a more efficient way to leverage the LLM's native capabilities for formatting, thus reducing the code overhead? Could the external library for summarization be replaced or optimized for better performance?

  2. Latency Due to Excessive Interaction. Problem: There is noticeable latency, primarily caused by frequent back-and-forth communication between the Planner and CodeInterpreter. Additionally, outputs like Init_plan, Plan, and Steps are always printed, which adds clutter.

Query: Are there known methods to reduce this inter-component communication? Can outputs like Init_plan, Plan, and Steps be minimized or made more efficient?

  3. Inadequacy of Using CodeInterpreter Alone. Observation: In an attempt to address these issues, I tried removing the Planner component and using only CodeInterpreter. However, this approach proved insufficient for our needs.

  4. Overall Speed Improvement. Primary Goal: Enhancing the system's speed and efficiency is my main objective.

  5. If I use TaskWeaver as a library, how can I show the graphs on the frontend, or pull data from the respective session folder?

Request for Help: I would appreciate any advice, best practices, or insights on optimizing the speed of such systems. Specific recommendations or examples of similar implementations would be highly beneficial.

liqul commented 8 months ago

@NisaarAgharia

  1. I didn't quite get the point. Do you want to implement a plugin that summarizes information from something like a file? Which part deals with the formatting of the summary? What is the external lib?

  2 and 4. There are indeed multiple interactions between the planner and the code interpreter; this is by design, to handle complex tasks that need more than one step to accomplish. If a task has 3 steps, the system needs to call the LLM about 6 times, which is hard to reduce. We did try to save some tokens by minimizing init_plan and plan when they are the same as the previous ones. But this is not controllable (they are generated by the LLM) and has a negative impact on planning quality. That is why we are conservative about changing it now. We are still exploring other opportunities to speed it up. But to be honest, if your task cannot be done in one step, the planner is still necessary and the latency is hard to reduce.

  3. We do have a CodeInterpreter-only mode that bypasses the planning process. As discussed above, it all depends on the complexity of the task.

  5. You can take a look at the implementation of app.py in the playground/UI/ folder. This is an example of connecting TaskWeaver with Chainlit to display various artifacts on the frontend.
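The call count mentioned in the reply to points 2 and 4 can be turned into a back-of-the-envelope latency estimate. This is only a sketch: the split of one Planner call plus one CodeInterpreter call per step is an assumption that matches the "3 steps, about 6 calls" figure, and `seconds_per_call` is a hypothetical number you would measure for your own model and prompts.

```python
def estimate_llm_calls(num_steps: int, calls_per_step: int = 2) -> int:
    """Rough LLM call count, assuming one Planner call and one
    CodeInterpreter call per plan step (an assumption, not a measured value)."""
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")
    return num_steps * calls_per_step


def estimate_latency_seconds(num_steps: int, seconds_per_call: float) -> float:
    """Rough end-to-end latency; seconds_per_call is a hypothetical average."""
    return estimate_llm_calls(num_steps) * seconds_per_call
```

Under these assumptions, a 3-step task at a hypothetical 5 seconds per call would take about 30 seconds, which is why removing a whole step tends to help more than trimming the printed plan.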

NisaarAgharia commented 8 months ago

Thanks for the advice and quick response

Circling back to point 1, for example: I provided a query that involves fetching data from a DB and then summarizing it. The code interpreter does the job of fetching the data from the database. It should then summarize the info with the LLM only, but instead the planner passed the task to the code interpreter again, and the interpreter started writing code to summarize it. In one instance it started using the Gensim library for the summary. The behavior I want is for the output to be summarized using the LLM's own capability rather than by the interpreter writing code.

For point 3. Can we add basic planning capabilities inside the code interpreter as it has a thought component and how can that planning capability be matured inside the Only Interpreter mode?

For Point 4. Also, I have checked the Chainlit application. I want to do this with React. What are some ways I can extract CSVs and graphs from the Session/CWD folder? I am trying to understand how a frontend can get links to the CSV and graph files and display them.

liqul commented 8 months ago

Thanks for the advice and quick response

Circling back to point 1, for example: I provided a query that involves fetching data from a DB and then summarizing it. The code interpreter does the job of fetching the data from the database. It should then summarize the info with the LLM only, but instead the planner passed the task to the code interpreter again, and the interpreter started writing code to summarize it. In one instance it started using the Gensim library for the summary. The behavior I want is for the output to be summarized using the LLM's own capability rather than by the interpreter writing code.

Response: Oh, I see. The default behavior of the planner is only to send tasks to the CI, which is logically simpler. But I see your point that, in this specific case, it is more natural for the planner to do the summarization itself. A workaround is to add an example for the planner as described here, which can guide the planner to work as expected. The example does not need to contain real data (one or two sentences are enough), as a full example would cost extra tokens in the prompt. Keep in mind that the goal is to teach the planner what to do when it receives a message from the code interpreter: do the summarization itself instead of sending the data from the DB back to the CI.

For point 3. Can we add basic planning capabilities inside the code interpreter as it has a thought component and how can that planning capability be matured inside the Only Interpreter mode?

Response: Good observation! The code interpreter indeed has its own planning capability for generating code. As I said previously, it all depends on the complexity of the task. If you know that the task can be done within a single code snippet, I would recommend trying the planner.skip_planning configuration. In this mode, the planner passes the user's request directly to the CI. After receiving the execution result from the CI, the planner summarizes the result into a natural-language response. Let me know if this helps in your situation.
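A minimal sketch of switching that option on from Python. The key name planner.skip_planning comes from the reply above; the file name taskweaver_config.json inside the project folder and the helper name `enable_skip_planning` are assumptions for illustration, so check them against your own setup.

```python
import json
from pathlib import Path


def enable_skip_planning(project_dir: str) -> dict:
    """Set planner.skip_planning to true in the project's config file.

    Assumes the config lives at <project_dir>/taskweaver_config.json;
    existing keys in that file are preserved.
    """
    config_path = Path(project_dir) / "taskweaver_config.json"
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config["planner.skip_planning"] = True
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Example (hypothetical project path): enable_skip_planning("./project/")
```

Editing the JSON file by hand works just as well; the helper only makes the change repeatable when comparing latency with and without planning.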

For Point 4. Also, I have checked the Chainlit application. I want to do this with React. What are some ways I can extract CSVs and graphs from the Session/CWD folder? I am trying to understand how a frontend can get links to the CSV and graph files and display them.

Response: I'm not a React expert so may not know the details. You can check app.py from line 365 where we are getting the artifacts and displaying them in Chainlit.

jacobbridges commented 8 months ago

@NisaarAgharia, referring to your fifth point --

If I use Taskweaver as a library then how can show the Graphs on the Frontend? or pull data from the respective session folder?

Here is a working example of calling taskweaver as a library to generate a file, then moving the file to somewhere else:

from pathlib import Path

from taskweaver.app import TaskWeaverApp
from taskweaver.memory.attachment import AttachmentType

# Instantiate a taskweaver session programmatically
taskweaver = TaskWeaverApp(app_dir="./project/")
session = taskweaver.get_session()

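# Perform some task that will generate some artifacts
# (named response_round to avoid shadowing the built-in round()):
response_round = session.send_message("Generate 20 random numbers and plot them on a bar chart.")

# Get the artifact paths by scanning every attachment of every post in the round
artifact_paths = [
    (Path(session.execution_cwd) / Path(artifact))
    for post in response_round.post_list
    for attachment in post.attachment_list
    if attachment.type == AttachmentType.artifact_paths
    for artifact in attachment.content
]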

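# Move all artifacts to another folder, keeping each file's full name
# (.name keeps the extension; .stem would drop it)
media_dir_for_react_app = Path("some/path/to/frontend/media/dir")
media_dir_for_react_app.mkdir(parents=True, exist_ok=True)
for artifact_path in artifact_paths:
    artifact_path.rename(media_dir_for_react_app / artifact_path.name)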
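Once the files are in a media folder, a React app still needs URLs for them. One way to get there, sketched with only the Python standard library, is to serve that folder over HTTP and hand the resulting links to the frontend. The function names `serve_media_dir` and `artifact_url` are hypothetical, and the wildcard CORS header is an assumption that only suits local development.

```python
import threading
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import quote


class CORSRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that lets a frontend on another origin fetch files."""

    def end_headers(self):
        # Wildcard origin is for local development only (assumption).
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def serve_media_dir(media_dir, host="127.0.0.1", port=0):
    """Serve media_dir in a background thread; port=0 picks a free port."""
    handler = partial(CORSRequestHandler, directory=str(media_dir))
    server = ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def artifact_url(server, file_name):
    """Build the URL a frontend can use in an <img> tag or a fetch() call."""
    host, port = server.server_address[:2]
    return f"http://{host}:{port}/{quote(file_name)}"
```

In a real deployment the same idea would usually sit behind whatever web framework already serves the backend API; call `server.shutdown()` and `server.server_close()` when the server is no longer needed.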

@liqul, this is a separate issue but I think having a list of artifact files available from Session would be very convenient.

NisaarAgharia commented 7 months ago

So I tried the suggestions given above, and here are my learnings:

For Point 1: After updating the Planner prompt, the Planner now decides whether it can do the summarization itself or should pass the request to the code interpreter.

For point 3: This doesn't work in my case, as the tasks are a little complex and need the planner.

There are a few other things I tried to make the system faster. I used GPT-3.5 for the planner and CI, but it failed miserably: it had issues with planning, following instructions, parsing, etc.

I am yet to explore the point 4.

liqul commented 7 months ago

@liqul, this is a separate issue but I think having a list of artifact files available from Session would be very convenient.

@jacobbridges I agree that would be more convenient. I guess we can provide a function that does the same scan across all the posts :)
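A minimal sketch of what such a scan could look like as user-side code. It is not an existing TaskWeaver function: the name `collect_artifact_paths` is hypothetical, and it relies only on the attribute names used in the example earlier in this thread (post_list, attachment_list, type, content), matching the attachment type by name so it has no TaskWeaver import.

```python
from pathlib import Path
from typing import Iterable, List


def collect_artifact_paths(rounds: Iterable, execution_cwd: str) -> List[Path]:
    """Scan every post of every round and return all artifact file paths."""
    paths: List[Path] = []
    for rnd in rounds:
        for post in rnd.post_list:
            for attachment in post.attachment_list:
                # Accept either an enum member (with .name) or a plain string.
                type_name = getattr(attachment.type, "name", attachment.type)
                if type_name != "artifact_paths":
                    continue
                content = attachment.content
                # Tolerate a single path given as a string (assumption).
                items = [content] if isinstance(content, str) else content
                paths.extend(Path(execution_cwd) / item for item in items)
    return paths
```

Given the rounds returned by successive session.send_message calls, `collect_artifact_paths(rounds, session.execution_cwd)` would return every generated file in one list.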

liqul commented 7 months ago

So I tried the suggestions given above, and here are my learnings:

For Point 1: After updating the Planner prompt, the Planner now decides whether it can do the summarization itself or should pass the request to the code interpreter.

I agree with your thoughts. It would be an effective solution to tune the planner's prompt if you would like it to extend its responsibility. I think adding the summarization capability would not affect its own job of making plans.

For point 3: This doesn't work in my case, as the tasks are a little complex and need the planner.

There are a few other things I tried to make the system faster. I used GPT-3.5 for the planner and CI, but it failed miserably: it had issues with planning, following instructions, parsing, etc.

We tested GPT-3.5 quite a lot. The issue with GPT-3.5 is that it often fails to follow the instructions for generating the response in the right format. For example, it may not generate the send_to field or the message field. Another issue is that its code generation is much worse than GPT-4's, so it is only suitable if you need very simple code generation.
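When trying a cheaper model, it can help to measure how often this format failure happens before committing to it. The sketch below checks a raw model reply for the send_to and message fields named above. The JSON shape it expects, a top-level "response" list of {"type", "content"} items, is an assumption for illustration, and `missing_fields` is a hypothetical helper, not part of TaskWeaver.

```python
import json
from typing import List, Sequence


def missing_fields(raw_reply: str, required: Sequence[str] = ("send_to", "message")) -> List[str]:
    """Return the required field types absent from a model reply.

    Assumes a reply shaped like {"response": [{"type": ..., "content": ...}]}.
    A reply that is not valid JSON counts as missing every required field.
    """
    try:
        parsed = json.loads(raw_reply)
    except json.JSONDecodeError:
        return list(required)
    items = parsed.get("response", []) if isinstance(parsed, dict) else []
    if not isinstance(items, list):
        items = []
    present = {item.get("type") for item in items if isinstance(item, dict)}
    return [name for name in required if name not in present]
```

Running this over a batch of logged replies from each model gives a simple failure rate to compare, which is more informative than a handful of manual trials.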

I am yet to explore the point 4.

liqul commented 5 months ago

close inactive issues