VRSEN / agency-swarm

The only reliable agent framework built on top of the latest OpenAI Assistants API.
https://vrsen.github.io/agency-swarm/
MIT License

Fails to read/access files (csv, pdf, etc.) #161

Closed Ruhul-Quddus-Tamim closed 2 months ago

Ruhul-Quddus-Tamim commented 4 months ago

Agency Swarm is great! But it is sometimes frustrating that custom tools fail to access uploaded files from storage: the correct file path is passed to the custom tool, yet the tool cannot find the file. If, instead of developing your own tool, you upload the file through the Gradio UI and run it with Code Interpreter, it works well.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 30 days with no activity. Please upgrade to the latest version and test it again.

bonk1t commented 2 months ago

Hi @Ruhul-Quddus-Tamim,

Thank you for raising this issue!

Code Interpreter works because files uploaded through the Gradio UI are sent to OpenAI's servers, where Code Interpreter runs and can read them. Custom tools run on your machine, so a path that refers to the remote copy does not exist locally. To let local tools use uploaded files, you can extend the Agency class so that it also saves each upload to a predictable local folder. Below is a simplified example.

Create ExtendedAgency.py:

import shutil
from pathlib import Path
from agency_swarm import Agency
from agency_swarm.threads import Thread

class ExtendedAgency(Agency):
    def custom_demo(self, height=450, dark_mode=True, **kwargs):
        """
        Launches a Gradio-based demo interface for the agency chatbot.
        """

        try:
            import gradio as gr
        except ImportError as exc:
            # Re-raise as ImportError (not a bare Exception) and keep the original cause
            raise ImportError("Please install gradio: pip install gradio") from exc

        message_file_ids = []
        message_file_names = None
        recipient_agents = [agent.name for agent in self.main_recipients]
        recipient_agent = self.main_recipients[0]

        # Handle file uploads
        def handle_files_upload(file, save_folder):
            """Saves uploaded files to a predictable local folder."""

            folder_path = Path(save_folder)
            folder_path.mkdir(parents=True, exist_ok=True)

            if file:
                try:
                    filename = Path(file).name
                    # Copy the temporary upload into the local folder (contents only)
                    shutil.copyfile(file, folder_path / filename)
                    print(f"File uploaded: {filename}")
                    return str(folder_path / filename)  # Return local file path for tools
                except Exception as e:
                    print(f"Error during file upload: {e}")
                    return str(e)

            return "No file uploaded"

        # Gradio interface
        with gr.Blocks() as demo:
            chatbot = gr.Chatbot(height=height)
            with gr.Row():
                with gr.Column(scale=5):
                    msg = gr.Textbox(label="Your Message", lines=6)
                    button = gr.Button(value="Send", variant="primary")
                with gr.Column(scale=1):
                    file_upload = gr.File(label="Upload File", file_count="single")

            # Linking upload logic
            file_upload.change(lambda file: handle_files_upload(file, "./uploads"), file_upload)

            demo.queue()

        # Launch the demo
        demo.launch(**kwargs)
        return demo

    def _init_threads(self):
        """Initializes threads for communication between agents."""
        self.main_thread = Thread(self.user, self.ceo)

Use ExtendedAgency in your agency.py:

from MyAgency.ExtendedAgency import ExtendedAgency  # adjust the import path to match your project

# Instantiate the agency
agency = ExtendedAgency(
    [your_agent, ...],
    shared_instructions="./agency_manifesto.md",  # shared instructions for all agents
    max_prompt_tokens=25000,  # default tokens in conversation for all agents
    temperature=0.0,  # default temperature for all agents
)

# Start the Gradio demo
demo = agency.custom_demo(server_name="0.0.0.0", server_port=8000)

Key Points:

  1. File Upload Logic: The handle_files_upload function ensures that the uploaded files are saved locally to a predictable folder (./uploads), making it easy for your local tools to access the files by referencing the returned file paths.

  2. Gradio Integration: The custom_demo method integrates Gradio for user interaction, including a file upload component that links directly to the file upload logic.

  3. Instantiating the Agency: You can instantiate the ExtendedAgency and start the Gradio demo using the provided code snippet. This sets up your agents and runs the web interface locally.
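
On the tool side, the first point above can be sketched as a small helper that a custom tool calls to find an uploaded file. This is only an illustration, not part of agency-swarm: `UPLOAD_DIR`, `resolve_upload`, and `read_upload_text` are hypothetical names, and the sketch assumes uploads were saved to `./uploads` as in `handle_files_upload`.

```python
from pathlib import Path

# Assumption: the same folder that handle_files_upload writes to.
UPLOAD_DIR = Path("./uploads")


def resolve_upload(filename: str, upload_dir: Path = UPLOAD_DIR) -> Path:
    """Return the local path of an uploaded file, or raise if it is missing."""
    upload_dir = Path(upload_dir)
    # Keep only the final path component so "../x" cannot escape the folder.
    candidate = upload_dir / Path(filename).name
    if not candidate.is_file():
        # List what is actually there so the agent can correct the file name.
        available = (
            sorted(p.name for p in upload_dir.iterdir() if p.is_file())
            if upload_dir.is_dir()
            else []
        )
        raise FileNotFoundError(
            f"{filename!r} not found in {upload_dir}; available files: {available}"
        )
    return candidate


def read_upload_text(filename: str, upload_dir: Path = UPLOAD_DIR) -> str:
    """Read an uploaded text-like file (for example a CSV) as a string."""
    return resolve_upload(filename, upload_dir).read_text(encoding="utf-8")
```

A custom tool's `run` method could then call something like `read_upload_text(self.file_name)`, where `file_name` is a hypothetical tool field holding the name of the file the user uploaded. Binary formats such as PDF would need a suitable parser on top of `resolve_upload` rather than `read_text`.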

Let me know if you need any further clarification or assistance!

Ruhul-Quddus-Tamim commented 2 months ago

Thanks @bonk1t! That helps a lot.