culture-dao / chainlit-template

Chainlit OAI Assistants Template (COAT)
MIT License
Dynamic file maps #10

Open dahifi opened 3 months ago

dahifi commented 3 months ago

This is mostly AFGE-related, and we're going to get rid of it in favor of the new datastores.

dahifi commented 3 months ago

This will be handled in a subclass of the OAIObjectHandler class that we are using for assistants and datastores.

dahifi commented 2 months ago

This is a simplified domain map of the objects that we're dealing with.

classDiagram
    class Assistant {
        +String id
        +String object
        +Int created_at
        +String name
        +String description
        +String model
        +String instructions
        +Tool[] tools
        +Metadata metadata
        +Float top_p
        +Float temperature
        +String response_format
    }

    class Tool {
        +String type
    }

    class VectorStore {
        +String id
        +String object
        +Int created_at
        +Int usage_bytes
        +Int last_active_at
        +String name
        +String status
        +Metadata metadata
        +Int last_used_at
    }

    class VectorStoreFile {
        +String id
        +String object
        +Int usage_bytes
        +Int created_at
        +String vector_store_id
        +String status
        +String last_error
        +String file_id
    }

    class File {
        +String id
        +String object
        +Int bytes
        +Int created_at
        +String filename
        +String purpose
    }

    class Message {
        +String id
        +String object
        +Int created_at
        +String content
        +Annotation[] annotations
    }

    class Annotation {
        +String id
        +String object
        +Int created_at
        +String file_id
    }

    Assistant "1" -- "0..128" Tool
    Tool <|-- VectorStore
    VectorStore "1" -- "0..10000" VectorStoreFile : contains
    VectorStoreFile "*" -- "1" File : maps to
    Message "1" -- "*" Annotation : contains
    Annotation "*" -- "1" File : maps to
    Tool "1" -- "*" Annotation : generates
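
For reference, part of the map above could be written down as plain Python dataclasses. This is only a sketch: the field names come from the diagram, the default values are placeholders, and the `files_in_store` helper is hypothetical rather than part of any SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class File:
    id: str
    filename: str
    purpose: str
    bytes: int = 0        # placeholder default, for brevity only
    created_at: int = 0   # placeholder default, for brevity only


@dataclass
class VectorStoreFile:
    id: str
    vector_store_id: str
    file_id: str                      # "maps to" exactly one File
    status: str = ""                  # placeholder default
    last_error: Optional[str] = None


def files_in_store(store_id, vector_store_files, files):
    """Resolve the File objects that a vector store contains (hypothetical helper)."""
    by_id = {f.id: f for f in files}
    return [
        by_id[vsf.file_id]
        for vsf in vector_store_files
        if vsf.vector_store_id == store_id and vsf.file_id in by_id
    ]
```

Keeping `VectorStoreFile` as its own type, instead of nesting files inside the store, matches the "contains" and "maps to" edges in the diagram and makes the orphan check a simple lookup on `file_id`.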

dahifi commented 2 months ago

Immediate Goals and System Requirements

User-Facing Goals:

  1. Load an Assistant and Display Attached Tools and Files:
    • Provide visibility on the tools and files attached to a specific assistant.

Admin Goals:

  1. Identify and Clean Orphan Files:
    • Determine which files are not associated with any tools, vector stores, or assistants, and clean them out.
  2. Maintain Up-to-Date Codexes:
    • Ensure the files and vector store files used by tools are current and properly versioned.

Bringing Everything Together

Let's outline the necessary steps and components to achieve these goals:

1. Extend OAIHandler and Subclasses

Ensure we have an OAIHandler class and appropriate subclasses for managing CRUD operations:

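class OAIHandler:
    """Base class for CRUD operations against the OpenAI API."""

    def __init__(self, api_key):
        self.api_key = api_key

    def create(self, endpoint, data):
        # Create a new object at the given endpoint
        raise NotImplementedError

    def read(self, endpoint, object_id):
        # Fetch a single object by ID
        raise NotImplementedError

    def list(self, endpoint):
        # Return all objects at the endpoint; clean_orphan_files calls this
        raise NotImplementedError

    def update(self, endpoint, object_id, data):
        # Modify an existing object
        raise NotImplementedError

    def delete(self, endpoint, object_id):
        # Remove an object by ID
        raise NotImplementedError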

class AssistantHandler(OAIHandler):
    def __init__(self, api_key):
        super().__init__(api_key)

    # Additional methods specific to assistants

class VectorStoreHandler(OAIHandler):
    def __init__(self, api_key):
        super().__init__(api_key)

    # Additional methods specific to vector stores

class VectorStoreFileHandler(OAIHandler):
    def __init__(self, api_key):
        super().__init__(api_key)

    # Additional methods specific to vector store files

class FileHandler(OAIHandler):
    def __init__(self, api_key):
        super().__init__(api_key)

    # Additional methods specific to files
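
To exercise code built on these handlers without calling the API, the CRUD contract could be backed by an in-memory store. The `InMemoryHandler` below is a hypothetical test double, not part of the template. It mirrors the `create`/`read`/`update`/`delete` signatures above and also provides the `list` method that `clean_orphan_files` calls. The ID format it generates is made up for illustration.

```python
import itertools


class InMemoryHandler:
    """Hypothetical stand-in for OAIHandler that keeps objects in a dict."""

    def __init__(self):
        self._store = {}                 # maps endpoint -> {object_id: object}
        self._ids = itertools.count(1)   # source of made-up sequential IDs

    def create(self, endpoint, data):
        obj = dict(data, id=f"{endpoint}-{next(self._ids)}")
        self._store.setdefault(endpoint, {})[obj["id"]] = obj
        return obj

    def read(self, endpoint, object_id):
        return self._store[endpoint][object_id]

    def update(self, endpoint, object_id, data):
        self._store[endpoint][object_id].update(data)
        return self._store[endpoint][object_id]

    def delete(self, endpoint, object_id):
        return self._store[endpoint].pop(object_id)

    def list(self, endpoint):
        # Unknown endpoints yield an empty list rather than raising
        return [*self._store.get(endpoint, {}).values()]
```

Passing an object like this into the functions below, instead of constructing handlers inside them, would let the orphan and codex logic be tested offline.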

2. Load an Assistant and Display Attached Tools and Files

Create a function to load an assistant and display its attached tools and files:

def load_assistant(assistant_id):
    assistant_handler = AssistantHandler(api_key='your_api_key')
    vector_store_handler = VectorStoreHandler(api_key='your_api_key')
    file_handler = FileHandler(api_key='your_api_key')

    # Load assistant
    assistant = assistant_handler.read('assistants', assistant_id)

    # Get attached tools
    tools = assistant['tools']

    attached_files = []
    for tool in tools:
        if tool['type'] == 'vector_store':
            vector_store = vector_store_handler.read('vectorstores', tool['id'])
            vector_store_files = vector_store['files']
            for vsf in vector_store_files:
                file = file_handler.read('files', vsf['file_id'])
                attached_files.append(file)

    return assistant, tools, attached_files
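
One caveat worth checking against the API reference: in the Assistants v2 API, a tool entry appears to carry only a `type` such as `file_search`, and the vector stores are attached under `tool_resources.file_search.vector_store_ids` rather than on the tool itself. Assuming that layout, and assuming the assistant has already been converted to a plain dict, the IDs could be collected as sketched here (`vector_store_ids` is a hypothetical helper name):

```python
def vector_store_ids(assistant):
    """Return the vector store IDs attached to an assistant dict.

    Assumes the Assistants v2 layout, where the IDs live under
    tool_resources.file_search.vector_store_ids.
    """
    has_file_search = any(
        tool.get("type") == "file_search" for tool in assistant.get("tools", [])
    )
    if not has_file_search:
        return []
    resources = assistant.get("tool_resources") or {}
    return list((resources.get("file_search") or {}).get("vector_store_ids", []))
```

The loop in `load_assistant` could then iterate over these IDs instead of reading an `id` off each tool.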

3. Identify and Clean Orphan Files

Create a function to identify and clean orphan files:

def clean_orphan_files():
    file_handler = FileHandler(api_key='your_api_key')
    vector_store_file_handler = VectorStoreFileHandler(api_key='your_api_key')

    all_files = file_handler.list('files')
    all_vector_store_files = vector_store_file_handler.list('vectorstorefiles')

    # Identify files not referenced by any vector store file
    orphan_files = [file for file in all_files if not any(vsf['file_id'] == file['id'] for vsf in all_vector_store_files)]

    # Clean orphan files
    for orphan in orphan_files:
        file_handler.delete('files', orphan['id'])

    return orphan_files
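
Because deletion cannot be undone, it may be safer to separate finding orphans from deleting them. A minimal sketch, assuming files and vector store files are plain dicts with `id` and `file_id` keys as in the function above; `find_orphan_files` is a hypothetical helper:

```python
def find_orphan_files(all_files, all_vector_store_files):
    """Return files whose id is not referenced by any vector store file."""
    # Build the set once so each membership test is O(1)
    referenced = {vsf["file_id"] for vsf in all_vector_store_files}
    return [f for f in all_files if f["id"] not in referenced]


files = [{"id": "file-1"}, {"id": "file-2"}, {"id": "file-3"}]
vsfs = [{"file_id": "file-1"}, {"file_id": "file-3"}]
for orphan in find_orphan_files(files, vsfs):
    # Dry run: report only, delete nothing
    print("would delete", orphan["id"])
```

Files referenced elsewhere, for example by code interpreter resources or message attachments, would need to be added to the referenced set before anything is deleted. Vector store files are also likely to need listing per vector store rather than from a single global endpoint.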

4. Maintain Up-to-Date Codexes

Ensure files and vector store files are current and properly versioned:

def update_codexes():
    file_handler = FileHandler(api_key='your_api_key')
    vector_store_file_handler = VectorStoreFileHandler(api_key='your_api_key')

    # Implement logic to check and update codexes
    # For example, checking file hashes and re-uploading if necessary
    pass
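
The hash check mentioned in the comment could look roughly like this. It is a sketch under assumptions: `manifest` is a hypothetical dict mapping each filename to the SHA-256 digest recorded at its last upload, and the re-upload step itself is left out.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large documents are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stale_files(local_dir, manifest):
    """Return names of local files that are new or whose hash differs from the manifest."""
    stale = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file() and manifest.get(path.name) != sha256_of(path):
            stale.append(path.name)
    return stale
```

`update_codexes` could call `stale_files`, re-upload only the names it returns, and then write the new digests back to the manifest.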

Task List

Here are the tasks to complete the project:

- [ ] Extend OAIHandler class for Assistant, VectorStore, VectorStoreFile, and File operations. 📅 2024-06-01
- [ ] Implement load_assistant function to display attached tools and files. 📅 2024-06-02
- [ ] Implement clean_orphan_files function to identify and remove orphan files. 📅 2024-06-03
- [ ] Implement update_codexes function to maintain up-to-date files and vector store files. 📅 2024-06-04
- [ ] Integrate all components and test the system. 📅 2024-06-05

This plan ensures we have a clear strategy for managing our assistants, tools, and files, providing visibility and maintaining data integrity across the system.